Abstract. The 17th of the problems proposed by Steve Smale for the
21st century asks for the existence of a deterministic algorithm computing
an approximate solution of a system of $n$ complex polynomials in $n$
unknowns in time polynomial, on the average, in the size $N$ of the input
system. A partial solution to this problem was given by Carlos Beltrán
and Luis Miguel Pardo, who exhibited a randomized algorithm doing so.
In this paper we further extend this result in several directions. Firstly,
we exhibit a linear homotopy algorithm that efficiently implements a
non-constructive idea of Mike Shub. This algorithm is then used in
a randomized algorithm, call it LV, à la Beltrán-Pardo. Secondly, we
perform a smoothed analysis (in the sense of Spielman and Teng) of
algorithm LV and prove that its smoothed complexity is polynomial in
the input size and $\sigma^{-1}$, where $\sigma$ controls the size of the random
perturbation of the input systems. Thirdly, we perform a condition-based
analysis of LV. That is, we give a bound, for each system $f$, of the
expected running time of LV with input $f$. In addition to its dependence
on $N$ this bound also depends on the condition of $f$. Fourthly, and to
conclude, we return to Smale's 17th problem as originally formulated
for deterministic algorithms. We exhibit such an algorithm and show
that its average complexity is $N^{O(\log\log N)}$. This is nearly a solution to
Smale's 17th problem.
1. Introduction

In 2000, Steve Smale published a list of mathematical problems for the 
21st century [29]. The 17th problem in the list reads as follows: 

Can a zero of n complex polynomial equations in n unknowns be 
found approximately, on the average, in polynomial time with a 
uniform algorithm? 

Smale pointed out that "it is reasonable" to homogenize the polynomial
equations by adding a new variable and to work in projective space, after
which he made precise the different notions intervening in the question
above. We provide these definitions in full detail in Section 2. Before doing
so, in the remainder of this section, we briefly describe the recent history of
Smale's 17th problem and the particular contribution of the present paper.
The following summary of notations should suffice for this purpose.
We denote by $\mathcal{H}_{\mathbf{d}}$ the linear space of complex homogeneous polynomial
systems in $n + 1$ variables, with a fixed degree pattern $\mathbf{d} = (d_1, \ldots, d_n)$.
We let $D = \max_i d_i$, $N = \dim_{\mathbb{C}} \mathcal{H}_{\mathbf{d}}$, and $\mathcal{D} = \prod_i d_i$. We endow this space
with the unitarily invariant Bombieri-Weyl Hermitian product and consider
the unit sphere $S(\mathcal{H}_{\mathbf{d}})$ with respect to the norm induced by this product.
We then make this sphere a probability space by considering the uniform
measure on it. The expression "on the average" refers to expectation on
this probability space. Also, the expression "approximate zero" refers to
a point for which Newton's method, starting at it, converges immediately,
quadratically fast.

This is the setting underlying the series of papers [22, 23, 24, 25, 26],
commonly referred to as "the Bézout series", written by Shub and Smale
during the first half of the 1990s, a collection of ideas, methods, and results
that pervade all the research done on Smale's 17th problem since it was
proposed. The overall idea in the Bézout series is to use a linear homotopy.
That is, one starts with a system $g$ and a zero $\zeta$ of $g$ and considers the
segment $E_{f,g}$ with extremities $f$ and $g$. Here $f$ is the system whose zero
we want to compute. Almost surely, when one moves from $g$ to $f$, the
zero $\zeta$ of $g$ follows a curve in projective space to end in a zero of $f$. The
homotopy method consists of dividing the segment $E_{f,g}$ into a number, say $k$,
of subsegments $E_i$ small enough to ensure that an approximate zero $x_i$ of
the system at the origin of $E_i$ can be made into an approximate zero $x_{i+1}$
of the system at its end (via one step of Newton's method). The difficulty
of this overall idea lies in the following issues:

(1) How does one choose the initial pair $(g, \zeta)$?

(2) How does one choose the subsegments $E_i$? In particular, how large
should $k$ be?

The state of the art at the end of the Bézout series, i.e., in [26], showed
an incomplete picture. For (2), the rule consisted of taking a regular
subdivision of $E_{f,g}$ for a given $k$, executing the path-following procedure, and
repeating with $k$ replaced by $2k$ if the final point could not be shown to be
an approximate zero of $f$ (Shub and Smale provided criteria for checking
this). Concerning (1), Shub and Smale proved that good initial pairs $(g, \zeta)$
(in the sense that the average number of iterations for the rule above was
polynomial in the size of $f$) existed for each degree pattern $\mathbf{d}$, but they could
not exhibit a procedure to generate one such pair.

The next breakthrough took a decade to come. Beltrán and Pardo proposed
in [4, 5] that the initial pair $(g, \zeta)$ should be randomly chosen. The
consideration of randomized algorithms departs from the formulation of
Smale's 17th problem^1 but it is widely accepted that, in practical terms,

^1 In his description of Problem 17 Smale writes "Time is measured by the number of
arithmetic operations and comparisons, <, using real machines (as in Problem 3)" and in
the latter he points out that "In [Blum-Shub-Smale, 1989] a satisfactory definition [of these
machines] is proposed." The paper [9] quoted by Smale deals exclusively with deterministic
machines. Furthermore, Smale adds that "a probability measure must be put on the space
such algorithms are as good as their deterministic siblings. And in the case
at hand this departure turned out to pay off. The average (over $f$) of the
expected (over $(g, \zeta)$) number of iterations of the algorithm proposed in [5] is
$O(n^5 N^2 D^3 \log \mathcal{D})$. One of the most notable features of the ideas introduced
by Beltrán and Pardo is the use of a measure on the space of pairs $(g, \zeta)$
which is friendly enough to perform a probabilistic analysis while, at the
same time, allowing for efficient sampling.

Shortly after the publication of [4, 5], Shub wrote a short paper of great
importance [21]. Complexity bounds in both the Bézout series and the
Beltrán-Pardo results rely on condition numbers. Shub and Smale had
introduced a measure of condition $\mu_{\mathrm{norm}}(f, \zeta)$ for $f \in \mathcal{H}_{\mathbf{d}}$ and
$\zeta \in \mathbb{C}^{n+1}$ which, in case $\zeta$ is a zero of $f$, quantifies how much $\zeta$
varies when $f$ is slightly perturbed. Using this measure they defined the
condition number of a system $f$ by taking

(1.1)    $\mu_{\max}(f) := \max_{\zeta \,:\, f(\zeta) = 0} \mu_{\mathrm{norm}}(f, \zeta).$

The bounds mentioned above make use of an estimate for the worst-conditioned
system along the segment $E_{f,g}$, that is, of the quantity

(1.2)    $\max_{q \in E_{f,g}} \mu_{\max}(q).$

The main result in [21] shows that there exists a partition of $E_{f,g}$ which
successfully computes an approximate zero of $f$ whose number $k$ of pieces
satisfies

(1.3)    $k \le C\, D^{3/2} \int_{q \in E_{f,g}} \mu_2^2(q)\, dq,$

where $C$ is a constant and $\mu_2(q)$ is the mean square condition number of $q$
given by

(1.4)    $\mu_2^2(q) := \frac{1}{\mathcal{D}} \sum_{\zeta \,:\, q(\zeta) = 0} \mu_{\mathrm{norm}}^2(q, \zeta).$

This partition is explicitly described in [21], but no constructive procedure
to compute the partition is given there.



of all such $f$, for each $\mathbf{d} = (d_1, \ldots, d_n)$, and the time of an algorithm is averaged over the
space of $f$." Hence, the expression 'average time' refers to expectation over the input data
only.
In an oversight of this non-constructibility, Beltrán and Pardo [6] provided
a new version of their randomized algorithm^2 with an improved complexity
of $O(D^{3/2} n N)$.

A first goal of this paper is to validate Beltrán and Pardo's analysis in [6]
by exhibiting an efficiently constructible partition of $E_{f,g}$ which satisfies a
bound like (1.3). Our way of doing so owes much to the ideas in [21]. The
path-following procedure ALH relying on this partition is described in detail
in §3.1 together with a result, Theorem 3.1, bounding its complexity as
in (1.3).

The second goal of this paper is to perform a smoothed analysis of a
randomized algorithm (essentially Beltrán-Pardo randomization plus ALH)
computing a zero of $f$, which we call LV. What smoothed analysis is, is
succinctly explained in the citation of the Gödel Prize 2008 awarded to its
creators, Daniel Spielman and Shang-Hua Teng:^3

Smoothed Analysis is a novel approach to the analysis of algo- 
rithms. It bridges the gap between worst-case and average case 
behavior by considering the performance of algorithms under a 
small perturbation of the input. As a result, it provides a new 
rigorous framework for explaining the practical success of algo- 
rithms and heuristics that could not be well understood through 
traditional algorithm analysis methods. 

In a nutshell, smoothed analysis is a probabilistic analysis which replaces
the 'evenly spread' measures underlying the usual average-case analysis
(uniform measures, standard normals, . . . ) by a measure centered at the input
data. That is, it replaces the 'average data input' (an unlikely input in
actual computations) by a small random perturbation of a worst-case data
and substitutes the typical quantity studied in the average-case context,

    $\mathop{\mathbb{E}}_{f \sim \mathcal{R}} \varphi(f),$

by

    $\sup_{\overline{f}} \mathop{\mathbb{E}}_{f \sim \mathcal{C}(\overline{f}, r)} \varphi(f).$

Here $\varphi(f)$ is the function of $f$ one is interested in (e.g., the complexity of an
algorithm over input $f$), $\mathcal{R}$ is the 'evenly spread' measure mentioned above,
and $\mathcal{C}(\overline{f}, r)$ is an isotropic measure centered at $\overline{f}$ with a dispersion (e.g.,
variance) given by a (small) parameter $r > 0$.



^2 The algorithm in [6] explicitly calls as a subroutine "the homotopy algorithm of [21]"
without noticing that the partition in [21] is non-algorithmic. Actually, the word
'algorithm' is never used in [21]. The main goal of [21], as stated in the abstract, is to motivate
"the study of short paths or geodesics in the condition metric"; the proof of (1.3) does
not require the homotopy to be linear and one may wonder whether other paths in $\mathcal{H}_{\mathbf{d}}$ may
substantially decrease the integral in the right-hand side. This goal has been addressed,
but not attained, in [7]. As of today it remains a fascinating open problem.

^3 See http://www.fmi.uni-stuttgart.de/ti/personen/Diekert/citation08.pdf for
the whole citation.
An immediate advantage of smoothed analysis is its robustness with respect
to the measure $\mathcal{C}$ (see §3.4 below). This is in contrast with the most
common critique of average-case analysis: "A bound on the performance of
an algorithm under one distribution says little about its performance under
another distribution, and may say little about the inputs that occur in
practice" [31].

The precise details of the smoothed analysis we perform for zero finding
are in §3.4.

To describe the third goal of this paper we recall Smale's ideas of complexity
analysis as exposed in [28]. In this program-setting paper Smale writes
that he sees "much of the complexity theory [. . . ] of numerical analysis
conveniently represented by a two-part scheme." The first part amounts to
obtaining, for the running time time($f$) of an algorithm on input $f$, an estimate
of the form

(1.5)    $\mathrm{time}(f) \le K\,(\mathrm{size}(f) + \mu(f))^{c},$

where $K$, $c$ are positive constants and $\mu(f)$ is a condition number for $f$. The
second takes the form

(1.6)    $\mathrm{Prob}\{\mu(f) \ge T\} \le T^{-c},$

"where a probability measure has been put on the space of inputs." The first
part of this scheme provides understanding of the behavior of the algorithm
for specific inputs $f$ (in terms of their condition as measured by $\mu(f)$). The
second, combined with the first, allows one to obtain probability bounds for
time($f$) in terms of size($f$) only. But these bounds say little about time($f$)
for actual input data $f$.

Part one of Smale's program is missing in the work related to his 17th
problem. All estimates on the running time of path-following procedures for
a given $f$ occurring in both the Bézout series and the work by Beltrán and
Pardo are expressed in terms of the quantity in (1.2) or the integral in (1.3),
not purely in terms of the condition of $f$. We fill this gap by showing for
the expected running time of LV a bound like (1.5) with $\mu(f) = \mu_{\max}(f)$.
The precise statement, Theorem 3.7, is in §3.6 below.

Last but not least, to close this introduction, we return to its opening
theme: Smale's 17th problem. Even though randomized algorithms are
efficient in theory and reliable in practice they do not offer an answer to the
question of the existence of a deterministic algorithm computing approximate
zeros of complex polynomial systems in average polynomial time. The
situation is akin to the development of primality testing. It was precisely
with this problem that randomized algorithms became a means to deal with
apparently intractable problems [30, 17]. Yet, the eventual display of a
deterministic polynomial-time algorithm [1] was justly welcomed as a major
achievement. The fourth main result in this paper exhibits a deterministic
algorithm computing approximate zeros in average time $N^{O(\log\log N)}$. To do
so we design and analyze a deterministic homotopy algorithm, call it MD,
ON A PROBLEM POSED BY STEVE SMALE 






whose average complexity is polynomial in $n$ and $N$ and exponential in $D$.
This already yields a polynomial-time algorithm when one restricts the
degree $D$ to be at most $n^{1-\varepsilon}$ for any fixed $\varepsilon > 0$ (and, in particular, when
$D$ is fixed as in a system of quadratic or cubic equations). Algorithm MD
is fast when $D$ is small. We complement it with an algorithm that uses a
procedure proposed by Jim Renegar [18] and which computes approximate
zeros similarly fast when $D$ is large.

In order to prove the results described above we have relied on a number of 
ideas and techniques. Some of them — e.g., the use of the coarea formula or 
of the Bombieri-Weyl Hermitian inner product — are taken from the Bezout 
series and are pervasive in the literature on the subject. Some others — 
notably the use of the Gaussian distribution and its truncations in Euclidean 
space instead of the uniform distribution on a sphere or a projective space — 
are less common. The blending of these ideas has allowed us a development 
which unifies the treatment of the several situations we consider for zero 
finding in this paper. 

Acknowledgments. Thanks go to Carlos Beltran and Jean-Pierre Dedieu 
for helpful comments. We are very grateful to Mike Shub for constructive 
criticism and insightful comments that helped to improve the paper consid- 
erably. This work was finalized during the special semester on Foundations 
of Computational Mathematics in the fall of 2009. We thank the Fields 
Institute in Toronto for hospitality and financial support. 

2. Preliminaries 

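2.1. Setting and Notation. For $d \in \mathbb{N}$ we denote by $\mathcal{H}_d$ the subspace
of $\mathbb{C}[X_0, \ldots, X_n]$ of homogeneous polynomials of degree $d$. For $f \in \mathcal{H}_d$ we
write

    $f(X) = \sum_{\alpha} \binom{d}{\alpha}^{1/2} a_\alpha X^\alpha,$

where $\alpha = (\alpha_0, \ldots, \alpha_n)$ is assumed to range over all multi-indices such that
$|\alpha| = \sum_{k=0}^{n} \alpha_k = d$, $\binom{d}{\alpha}$ denotes the multinomial coefficient, and
$X^\alpha := X_0^{\alpha_0} X_1^{\alpha_1} \cdots X_n^{\alpha_n}$. That is, we take for basis of the linear space $\mathcal{H}_d$ the
Bombieri-Weyl basis consisting of the monomials $\binom{d}{\alpha}^{1/2} X^\alpha$. A reason to do
so is that the Hermitian inner product associated to this basis is unitarily
invariant. That is, if $g \in \mathcal{H}_d$ is given by $g(X) = \sum_{\alpha} \binom{d}{\alpha}^{1/2} b_\alpha X^\alpha$, then the
canonical Hermitian inner product

    $\langle f, g \rangle = \sum_{|\alpha| = d} a_\alpha \overline{b_\alpha}$

satisfies, for every element $\nu$ in the unitary group $\mathscr{U}(n + 1)$, that

    $\langle f, g \rangle = \langle f \circ \nu, g \circ \nu \rangle.$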
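
As an illustration of the Bombieri-Weyl product, the following is a minimal Python sketch, not taken from the paper. It assumes that a form is stored as a dictionary mapping exponent tuples $\alpha$ to ordinary monomial coefficients $c_\alpha$, so that the Bombieri-Weyl coordinate is $a_\alpha = c_\alpha / \binom{d}{\alpha}^{1/2}$; the names `multinomial` and `bw_inner` are hypothetical.

```python
from math import factorial


def multinomial(alpha):
    """Multinomial coefficient d! / (alpha_0! ... alpha_n!) with d = sum(alpha)."""
    result = factorial(sum(alpha))
    for a in alpha:
        result //= factorial(a)  # each partial quotient is an integer
    return result


def bw_inner(f, g):
    """Bombieri-Weyl product of two forms of the same degree.

    f and g map exponent tuples alpha to ordinary monomial coefficients
    (an assumed storage format); dividing by the multinomial coefficient
    converts to the weighted coordinates a_alpha and b_alpha.
    """
    total = 0j
    for alpha, c in f.items():
        if alpha in g:
            total += c * complex(g[alpha]).conjugate() / multinomial(alpha)
    return total


# (X0 + X1)^2 = X0^2 + 2 X0 X1 + X1^2, and the monomial X0^2.
square_of_sum = {(2, 0): 1, (1, 1): 2, (0, 2): 1}
x0_squared = {(2, 0): 1}
print(bw_inner(square_of_sum, square_of_sum).real)
print(bw_inner(x0_squared, x0_squared).real)
```

The squared norm of $(X_0 + X_1)^2$ comes out as $1 + 4/2 + 1 = 4$, while that of $X_0^2$ is 1. This is consistent with unitary invariance: the unitary substitution $X_0 \mapsto (X_0 + X_1)/\sqrt{2}$ sends $X_0^2$ to $(X_0 + X_1)^2 / 2$, whose squared norm is $4/4 = 1$.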
Fix $d_1, \ldots, d_n \in \mathbb{N} \setminus \{0\}$ and let $\mathcal{H}_{\mathbf{d}} = \mathcal{H}_{d_1} \times \cdots \times \mathcal{H}_{d_n}$ be the vector space of
polynomial systems $f = (f_1, \ldots, f_n)$ with $f_i \in \mathbb{C}[X_0, \ldots, X_n]$ homogeneous
of degree $d_i$. The space $\mathcal{H}_{\mathbf{d}}$ is naturally endowed with a Hermitian inner
product $\langle f, g \rangle = \sum_{i=1}^{n} \langle f_i, g_i \rangle$. We denote by $\|f\|$ the corresponding norm of
$f \in \mathcal{H}_{\mathbf{d}}$.

Recall that $N = \dim_{\mathbb{C}} \mathcal{H}_{\mathbf{d}}$ and $D = \max_i d_i$. Also, in the rest of this
paper, we assume $D \ge 2$ (the case $D = 1$ being solvable with elementary
linear algebra).
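
To make the size parameters concrete, here is a small Python sketch (an illustration, not part of the paper) that computes $N$, $D$ and the Bézout number $\mathcal{D} = \prod_i d_i$ for a given degree pattern. It relies on the standard fact that the space of forms of degree $d$ in $n + 1$ variables has dimension $\binom{n+d}{n}$; the function names are hypothetical.

```python
from math import comb, prod


def input_size(n, degrees):
    """N = dim_C H_d = sum_i C(n + d_i, n) for n forms in n + 1 variables."""
    if len(degrees) != n:
        raise ValueError("expected one degree per equation")
    return sum(comb(n + d, n) for d in degrees)


def max_degree(degrees):
    """D = max_i d_i."""
    return max(degrees)


def bezout_number(degrees):
    """Bezout number prod_i d_i: the number of zeros of a generic system."""
    return prod(degrees)


degrees = (2, 2, 3)
print(input_size(3, degrees), max_degree(degrees), bezout_number(degrees))
```

For three equations of degrees 2, 2, 3 in four variables this gives $N = 10 + 10 + 20 = 40$, $D = 3$ and $\mathcal{D} = 12$: the input size grows with the number of coefficients, while the number of zeros grows multiplicatively in the degrees.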

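Let $\mathbb{P}^n := \mathbb{P}(\mathbb{C}^{n+1})$ denote the complex projective space associated to
$\mathbb{C}^{n+1}$ and $S(\mathcal{H}_{\mathbf{d}})$ the unit sphere of $\mathcal{H}_{\mathbf{d}}$. These are smooth manifolds that
naturally carry the structure of a Riemannian manifold (for $\mathbb{P}^n$ the metric is
called the Fubini-Study metric). We will denote by $d_{\mathbb{P}}$ and $d_{\mathbb{S}}$ their Riemannian
distances which, in both cases, amount to the angle between the arguments.
Specifically, for $x, y \in \mathbb{P}^n$ one has

(2.1)    $\cos d_{\mathbb{P}}(x, y) = \frac{|\langle x, y \rangle|}{\|x\|\, \|y\|}.$

Occasionally, for $f, g \in \mathcal{H}_{\mathbf{d}} \setminus \{0\}$, we will abuse language and write $d_{\mathbb{S}}(f, g)$
to denote this angle, that is, the distance $d_{\mathbb{S}}\left(\frac{f}{\|f\|}, \frac{g}{\|g\|}\right)$.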
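We define the solution variety to be

    $V_{\mathbb{P}} := \{(f, \zeta) \in \mathcal{H}_{\mathbf{d}} \times \mathbb{P}^n \mid f \ne 0 \text{ and } f(\zeta) = 0\}.$

This is a smooth submanifold of $\mathcal{H}_{\mathbf{d}} \times \mathbb{P}^n$ and hence also carries a Riemannian
structure. We denote by $V_{\mathbb{P}}(f)$ the zero set of $f \in \mathcal{H}_{\mathbf{d}}$ in $\mathbb{P}^n$. By Bézout's
Theorem, it contains $\mathcal{D}$ points for almost all $f$. Let $Df(\zeta)|_{T_\zeta}$ denote the
restriction of the derivative of $f\colon \mathbb{C}^{n+1} \to \mathbb{C}^n$ at $\zeta$ to the tangent space
$T_\zeta := \{v \in \mathbb{C}^{n+1} \mid \langle v, \zeta \rangle = 0\}$ of $\mathbb{P}^n$ at $\zeta$. The subvariety of ill-posed pairs is
defined as

    $\Sigma'_{\mathbb{P}} := \{(f, \zeta) \in V_{\mathbb{P}} \mid \operatorname{rank} Df(\zeta)|_{T_\zeta} < n\}.$

Note that $(f, \zeta) \notin \Sigma'_{\mathbb{P}}$ means that $\zeta$ is a simple zero of $f$. In this case, by the
implicit function theorem, the projection $V_{\mathbb{P}} \to \mathcal{H}_{\mathbf{d}}$, $(g, x) \mapsto g$ can be locally
inverted around $(f, \zeta)$. The image $\Sigma$ of $\Sigma'_{\mathbb{P}}$ under the projection $V_{\mathbb{P}} \to \mathcal{H}_{\mathbf{d}}$
is called the discriminant variety.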

2.2. Newton's Method. In [20], Mike Shub introduced the following
projective version of Newton's method. We associate to $f \in \mathcal{H}_{\mathbf{d}}$ (with $Df(x)$
of rank $n$ for some $x$) a map $N_f\colon \mathbb{C}^{n+1} \setminus \{0\} \to \mathbb{C}^{n+1} \setminus \{0\}$ defined (almost
everywhere) by

    $N_f(x) = x - Df(x)|_{T_x}^{-1} f(x).$

Note that $N_f(x)$ is homogeneous of degree 0 in $f$ and of degree 1 in $x$ so
that $N_f$ induces a rational map from $\mathbb{P}^n$ to $\mathbb{P}^n$ (which we will still denote
by $N_f$) and this map is invariant under multiplication of $f$ by constants.

We note that $N_f(x)$ can be computed from $f$ and $x$ very efficiently: since
the Jacobian $Df(x)$ can be evaluated with $O(N)$ arithmetic operations [3],
one can do with a total of $O(N + n^3)$ arithmetic operations.
It is well-known that when $x$ is sufficiently close to a simple zero $\zeta$ of $f$,
the sequence of Newton iterates beginning at $x$ will converge quadratically
fast to $\zeta$. This property led Steve Smale to define the following intrinsic
notion of approximate zero.

Definition 2.1. By an approximate zero of $f \in \mathcal{H}_{\mathbf{d}}$ associated with a zero
$\zeta \in \mathbb{P}^n$ of $f$ we understand a point $x \in \mathbb{P}^n$ such that the sequence of Newton
iterates (adapted to projective space)

    $x_{i+1} := N_f(x_i)$

with initial point $x_0 := x$ converges immediately quadratically to $\zeta$, i.e.,

    $d_{\mathbb{P}}(x_i, \zeta) \le \left(\frac{1}{2}\right)^{2^i - 1} d_{\mathbb{P}}(x_0, \zeta)$

for all $i \in \mathbb{N}$.
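
The inequality in Definition 2.1 can be checked numerically on a toy case. The sketch below is only an illustration under simplifying assumptions: it uses the ordinary affine Newton iteration for the univariate polynomial $z^2 - 1$ and the absolute value as distance, as a stand-in for the projective operator $N_f$ and the distance on projective space. The function names are hypothetical, and only finitely many iterates are tested.

```python
def newton_step(z):
    """One affine Newton step for f(z) = z^2 - 1 (toy stand-in for N_f)."""
    return z - (z * z - 1.0) / (2.0 * z)


def converges_immediately_quadratically(x0, zeta, step, iterations=5):
    """Check d(x_i, zeta) <= (1/2)^(2^i - 1) * d(x_0, zeta) for i = 0..iterations."""
    d0 = abs(x0 - zeta)
    x = x0
    for i in range(iterations + 1):
        if abs(x - zeta) > 0.5 ** (2 ** i - 1) * d0:
            return False
        x = step(x)
    return True


print(converges_immediately_quadratically(1.1, 1.0, newton_step))
print(converges_immediately_quadratically(0.1, 1.0, newton_step))
```

Starting at 1.1 the errors drop from 0.1 to about 0.0045 and then to about $10^{-5}$, well inside the required bounds, so the check succeeds. Starting at 0.1 the first Newton step jumps to 5.05, which violates the bound for $i = 1$, so 0.1 is not an approximate zero associated with the zero 1 in this toy sense.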

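2.3. Condition Numbers. How close does $x$ need to be to $\zeta$ to be an
approximate zero? This depends on how well conditioned the zero $\zeta$ is.

For $f \in \mathcal{H}_{\mathbf{d}}$ and $x \in \mathbb{C}^{n+1} \setminus \{0\}$ we define the (normalized) condition
number $\mu_{\mathrm{norm}}(f, x)$ by

    $\mu_{\mathrm{norm}}(f, x) := \|f\| \left\| \left( Df(x)|_{T_x} \right)^{-1} \operatorname{diag}\left( \sqrt{d_1}\, \|x\|^{d_1 - 1}, \ldots, \sqrt{d_n}\, \|x\|^{d_n - 1} \right) \right\|,$

where $T_x$ denotes the Hermitian complement of $\mathbb{C}x$, the right-hand side
norm is the spectral norm, and $\operatorname{diag}(a_i)$ denotes the diagonal matrix with
entries $a_i$. Note that $\mu_{\mathrm{norm}}(f, x)$ is homogeneous of degree 0 in both
arguments, hence it is well defined for $(f, x) \in \mathcal{H}_{\mathbf{d}} \times \mathbb{P}^n$. If $x$ is a simple
zero of $f$, then $\ker Df(x) = \mathbb{C}x$ and hence $(Df(x)|_{T_x})^{-1}$ can be identified
with the Moore-Penrose inverse $Df(x)^{\dagger}$ of $Df(x)$. In this case we have
$\mu_{\mathrm{norm}}(f, x) \ge 1$, cf. [8, §12.4, Cor. 3].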

The following result (essentially, a $\gamma$-Theorem in Smale's theory of
estimates for Newton's method [27]) quantifies our claim above.

Theorem 2.2. Assume $f(\zeta) = 0$ and $d_{\mathbb{P}}(x, \zeta) \le \frac{u_0}{D^{3/2} \mu_{\mathrm{norm}}(f, \zeta)}$ where $u_0 :=
3 - \sqrt{7} \approx 0.3542$. Then $x$ is an approximate zero of $f$ associated with $\zeta$.

Proof. This is an immediate consequence of the projective $\gamma$-Theorem in [8,
p. 263, Thm. 1] combined with the higher derivative estimate [8, p. 267,
Thm. 2]. □

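2.4. Gaussian distributions. The distribution of input data will be
modelled with Gaussians. Let $\overline{x} \in \mathbb{R}^n$ and $\sigma > 0$. We recall that the Gaussian
distribution $N(\overline{x}, \sigma^2 I)$ on $\mathbb{R}^n$ with mean $\overline{x}$ and covariance matrix $\sigma^2 I$ is given
by the density

    $\rho(x) = \left( \frac{1}{\sigma \sqrt{2\pi}} \right)^{n} \exp\left( -\frac{\|x - \overline{x}\|^2}{2\sigma^2} \right).$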
3. Statement of Main Results

3.1. The Homotopy Continuation Routine ALH. Suppose that we are
given an input system $f \in \mathcal{H}_{\mathbf{d}}$ and an initial pair $(g, \zeta)$ in the solution
variety $V_{\mathbb{P}}$ such that $f$ and $g$ are $\mathbb{R}$-linearly independent. Let $\alpha = d_{\mathbb{S}}(f, g)$.
Consider the line segment $E_{f,g}$ in $\mathcal{H}_{\mathbf{d}}$ with endpoints $f$ and $g$. We
parameterize this segment by writing

    $E_{f,g} = \{q_\tau \in \mathcal{H}_{\mathbf{d}} \mid \tau \in [0, 1]\}$

with $q_\tau$ being the only point in $E_{f,g}$ such that $d_{\mathbb{S}}(g, q_\tau) = \tau\alpha$ (see
Figure 1). Explicitly, we have $q_\tau = t f + (1 - t) g$, where $t = t(\tau)$ is given
by Equation (5.4) below. If $E_{f,g}$ does not intersect the discriminant
variety $\Sigma$, there is a unique continuous map $[0, 1] \to V_{\mathbb{P}}$, $\tau \mapsto (q_\tau, \zeta_\tau)$ such that
$(q_0, \zeta_0) = (g, \zeta)$, called the lifting of $E_{f,g}$ with origin $(g, \zeta)$. In order to
find an approximation of the zero $\zeta_1$ of $f = q_1$ we may start with the zero
$\zeta = \zeta_0$ of $g = q_0$ and numerically follow the path $(q_\tau, \zeta_\tau)$ by subdividing
$[0, 1]$ with points $0 = \tau_0 < \tau_1 < \cdots < \tau_k = 1$ and by successively computing
approximations $x_i$ of $\zeta_{\tau_i}$ by Newton's method.

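More precisely, we consider the following algorithm ALH (Adaptive Linear
Homotopy) with the stepsize parameter $\lambda = 7.53 \cdot 10^{-3}$.

Algorithm ALH

input $f, g \in \mathcal{H}_{\mathbf{d}}$ and $\zeta \in \mathbb{P}^n$ such that $g(\zeta) = 0$
$\alpha := d_{\mathbb{S}}(f, g)$, $r := \|f\|$, $s := \|g\|$
$\tau := 0$, $q := g$, $x := \zeta$
repeat

    $\Delta\tau := \frac{\lambda}{\alpha D^{3/2} \mu_{\mathrm{norm}}^2(q, x)}$

    $\tau := \min\{1, \tau + \Delta\tau\}$

    $t := \frac{s}{r \sin\alpha \cot(\tau\alpha) - r \cos\alpha + s}$

    $q := t f + (1 - t) g$

    $x := N_q(x)$
until $\tau = 1$
RETURN $x$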
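
The update of $t$ inside the loop can be checked in isolation. The following Python sketch is an illustration under an explicit simplification: systems are replaced by plain real coordinate vectors with the Euclidean inner product (standing in for coordinates with respect to the Bombieri-Weyl basis), and the names `t_of_tau` and `q_of_tau` are hypothetical. It evaluates $t = s / (r \sin\alpha \cot(\tau\alpha) - r \cos\alpha + s)$ and confirms that the resulting point $q_\tau = t f + (1 - t) g$ sits at angle $\tau\alpha$ from $g$.

```python
import math


def norm(v):
    """Euclidean norm of a real coordinate vector."""
    return math.sqrt(sum(a * a for a in v))


def angle(u, v):
    """Angle between two nonzero real vectors, in [0, pi]."""
    c = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))


def t_of_tau(f, g, tau):
    """Parameter t with angle(g, t*f + (1-t)*g) = tau * angle(f, g)."""
    if tau == 0.0:
        return 0.0  # cot(0) is infinite, and q_0 = g
    alpha, r, s = angle(f, g), norm(f), norm(g)
    cot = math.cos(tau * alpha) / math.sin(tau * alpha)
    return s / (r * math.sin(alpha) * cot - r * math.cos(alpha) + s)


def q_of_tau(f, g, tau):
    """Point of the segment between g and f at angular fraction tau."""
    t = t_of_tau(f, g, tau)
    return [t * a + (1.0 - t) * b for a, b in zip(f, g)]


f = [0.0, 2.0, 0.0]
g = [3.0, 0.0, 0.0]
q = q_of_tau(f, g, 0.5)
print(t_of_tau(f, g, 0.5), angle(g, q), 0.5 * angle(f, g))
```

With $f$ and $g$ orthogonal of norms 2 and 3, half of the angle corresponds to $t = 3/5$ rather than $t = 1/2$: equal steps in $\tau$ are equal steps in angle on the sphere, not equal steps along the chord, which is why ALH recomputes $t$ from $\tau$ at every iteration.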

Our main result for this algorithm, which we will prove in Section 4, is
the following.

Theorem 3.1. The algorithm ALH stops after at most $k$ steps with

    $k \le 217\, D^{3/2}\, d_{\mathbb{S}}(f, g) \int_0^1 \mu_{\mathrm{norm}}^2(q_\tau, \zeta_\tau)\, d\tau.$

The returned point $x$ is an approximate zero of $f$ with associated zero $\zeta_1$.

Remark 3.2. 1. The bound in Theorem 3.1 is optimal up to a constant
factor. This easily follows by an inspection of its proof given in Section 4.
2. Algorithm ALH requires the computation of $\mu_{\mathrm{norm}}$ which, in turn,
requires the computation of the operator norm of a matrix. This cannot be
done exactly with rational operations and square roots only. We can do,
however, with a sufficiently good approximation of $\mu_{\mathrm{norm}}(q, x)$ and there
exist several numerical methods efficiently computing such an approximation.
We will therefore neglect this issue, pointing out, however, to the sceptical
reader that another course of action is possible. Indeed, one may replace
the operator norm by the Frobenius norm in the definition of $\mu_{\mathrm{norm}}$ and use the
bounds $\|M\| \le \|M\|_F \le \sqrt{\operatorname{rank}(M)}\, \|M\|$ to show that this change preserves
the correctness of ALH and adds a multiplicative factor $n$ in the right-hand
side of Theorem 3.1. A similar comment applies to the computation of $\alpha$
and $\cot(\tau\alpha)$ in algorithm ALH, which cannot be done exactly with rational
operations.

3.2. Randomization and Complexity: the Algorithm LV. ALH will
serve as the basic routine for a number of algorithms computing zeros of
polynomial systems in different contexts. In these contexts both the input
system $f$ and the origin $(g, \zeta)$ of the homotopy may be randomly chosen: in
the case of $(g, \zeta)$ as a computational technique and in the case of $f$ in order
to perform a probabilistic analysis of the algorithm's running time.

In both cases, a probability measure is needed: one for $f$ and one for the
pair $(g, \zeta)$. The measure for $f$ will depend on the kind of probabilistic
analysis (standard average-case or smoothed analysis) we perform. In contrast,
we will consider only one measure on $V_{\mathbb{P}}$, which we denote by $\rho_{\mathrm{st}}$, for
the initial pair $(g, \zeta)$. It consists of drawing $g$ from $\mathcal{H}_{\mathbf{d}}$ from the standard
Gaussian distribution (defined via the isomorphism $\mathcal{H}_{\mathbf{d}} \simeq \mathbb{R}^{2N}$ given by
the Bombieri-Weyl basis) and then choosing one of the (almost surely) $\mathcal{D}$
zeros of $g$ from the uniform distribution on $\{1, \ldots, \mathcal{D}\}$. The formula for the
density of $\rho_{\mathrm{st}}$ will be derived later, see Lemma 8.8(5). The above procedure
is clearly non-constructive as computing a zero of a system is the problem
we wanted to solve in the first place. One of the major contributions of
Beltrán and Pardo was to show that this drawback can be repaired. The
following result (a detailed version of the effective sampling in [6]) will be
proved in Section 9 as a special case of more general results we will need in
our development.

Proposition 3.3. We can compute a random pair $(g, \zeta) \in V_{\mathbb{P}}$ according to
the density $\rho_{\mathrm{st}}$ with $O(N)$ choices of random real numbers from the standard
Gaussian distribution and $O(DnN + n^3)$ arithmetic operations (including
square roots of positive numbers).

Algorithms using randomly drawn data are called probabilistic (or
randomized). Those that always return a correct output are said to be of type
Las Vegas. The following algorithm (which uses Proposition 3.3) belongs to
this class.
Algorithm LV

input $f \in \mathcal{H}_{\mathbf{d}}$

draw $(g, \zeta) \in V_{\mathbb{P}}$ from $\rho_{\mathrm{st}}$

run ALH on input $(f, g, \zeta)$

For an input $f \in \mathcal{H}_{\mathbf{d}}$ algorithm LV either outputs an approximate zero $x$
of $f$ or loops forever. By the running time $t(f, g, \zeta)$ we will understand the
number of elementary operations (i.e., arithmetic operations, evaluations
of the elementary functions sin, cos, cot, square root, and comparisons)
performed by LV on input $f$ with initial pair $(g, \zeta)$. For fixed $f$, this is a
random variable and its expectation $t(f) := \mathbb{E}_{(g,\zeta) \sim \rho_{\mathrm{st}}}(t(f, g, \zeta))$ is said to
be the expected running time of LV on input $f$.

For all $f$, $g$, $\zeta$, the running time $t(f, g, \zeta)$ is given by the number of
iterations $K(f, g, \zeta)$ of ALH with input this triple times the cost of an iteration,
the latter being dominated by that of computing one Newton iterate (which
is $O(N + n^3)$ independently of the triple $(f, g, \zeta)$; see §2.2). It therefore
follows that analyzing the expected running time of LV amounts to doing so
for the expected value, over $(g, \zeta) \in V_{\mathbb{P}}$ drawn from $\rho_{\mathrm{st}}$, of $K(f, g, \zeta)$. We
denote this expectation by

    $K(f) := \mathop{\mathbb{E}}_{(g,\zeta) \sim \rho_{\mathrm{st}}} \left( K(f, g, \zeta) \right).$

3.3. Average Analysis of LV. To talk about the average complexity of LV
requires specifying a measure for the set of inputs. The most natural choice
is the standard Gaussian distribution on $\mathcal{H}_{\mathbf{d}}$. Since $K(f)$ is invariant
under scaling, we may equivalently assume that $f$ is chosen in the unit sphere
$S(\mathcal{H}_{\mathbf{d}})$ from the uniform distribution. With this choice, we say a Las Vegas
algorithm is average polynomial time when the average, over $f \in S(\mathcal{H}_{\mathbf{d}})$,
of its expected running time is polynomially bounded in the size $N$ of $f$.
The following result shows that LV is average polynomial time. It is
essentially the main result in [6] (modulo the existence of ALH and with specific
constants).

Theorem 3.4. The average of the expected number of iterations of
Algorithm LV is bounded as ($n \ge 4$)

    $\mathop{\mathbb{E}}_{f \in S(\mathcal{H}_{\mathbf{d}})} K(f) \le 3707\, D^{3/2} N (n + 1).$

3.4. Smoothed Analysis of LV. A smoothed analysis of an algorithm
consists of bounding, for all possible input data $\overline{f}$, the average of its running
time (its expected running time if it is a Las Vegas algorithm) over small
perturbations of $\overline{f}$. To perform such an analysis, a family of measures
(parameterized by a parameter $r$ controlling the size of the perturbation) is
considered with the following characteristics:

(1) the density of an element $f$ depends only on the distance $\|f - \overline{f}\|$.

(2) the value of $r$ is closely related to the variance of $\|f - \overline{f}\|$.
Then, the average above is estimated as a function of the data size $N$ and
the parameter $r$, and a satisfying result, which is described by the expression
smoothed polynomial time, demands that this function is polynomially
bounded in $r^{-1}$ and $N$. Possible choices for the measures' family are the
Gaussians $N(\overline{f}, \sigma^2 I)$ (used, for instance, in [HI [I9l [32l [33] ) and the uniform
measure on disks $B(\overline{f}, r)$ (used in [21 [TTl US]). Other families may also be
used and an emerging impression is that smoothed analysis is robust in the
sense that its dependence on the chosen family of measures is low. This
tenet was argued for in [15] where a uniform measure is replaced by an
adversarial measure (one having a pole at $\overline{f}$) without a significant loss in the
estimated averages.

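In this paper, for reasons of technical simplicity and consistency with the
rest of the exposition, we will work with truncated Gaussians defined as
follows. For $\overline{f} \in \mathcal{H}_{\mathbf{d}}$ and $\sigma > 0$ we shall denote by $N(\overline{f}, \sigma^2 I)$ the Gaussian
distribution on $\mathcal{H}_{\mathbf{d}} \simeq \mathbb{R}^{2N}$ (defined with respect to the Bombieri-Weyl
basis) with mean $\overline{f}$ and covariance matrix $\sigma^2 I$. Further, for $A > 0$ let
$P_{A,\sigma} := \mathrm{Prob}\{\|f\| \le A \mid f \sim N(0, \sigma^2 I)\}$. We define the truncated Gaussian
$N_A(\overline{f}, \sigma^2 I)$ with center $\overline{f} \in \mathcal{H}_{\mathbf{d}}$ as the probability measure on $\mathcal{H}_{\mathbf{d}}$ with
density

(3.1)    $\rho(f) = \begin{cases} \dfrac{\rho_{\overline{f},\sigma}(f)}{P_{A,\sigma}} & \text{if } \|f - \overline{f}\| \le A, \\ 0 & \text{otherwise,} \end{cases}$

where $\rho_{\overline{f},\sigma}$ denotes the density of $N(\overline{f}, \sigma^2 I)$. Note that $N_A(\overline{f}, \sigma^2 I)$ is isotropic
around its mean $\overline{f}$.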
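
One simple way to draw from such a truncated Gaussian is rejection sampling: draw from the untruncated Gaussian and keep the draw only if it lies within distance $A$ of the center, which yields exactly the density above. The Python sketch below is an illustration, not the paper's procedure. It assumes that a system is represented by its real coordinate vector of length $2N$ with respect to the Bombieri-Weyl basis, and the function name is hypothetical.

```python
import math
import random


def sample_truncated_gaussian(center, sigma, radius, rng):
    """Draw from N(center, sigma^2 I) conditioned on ||f - center|| <= radius.

    Rejection sampling: the accepted draws have the Gaussian density
    divided by the acceptance probability on the ball, and density 0 outside.
    """
    while True:
        sample = [c + rng.gauss(0.0, sigma) for c in center]
        dist = math.sqrt(sum((x - c) ** 2 for x, c in zip(sample, center)))
        if dist <= radius:
            return sample


rng = random.Random(0)            # fixed seed, for reproducibility only
N = 3                             # toy input size, so 2N = 6 real coordinates
center = [0.0] * (2 * N)
A = math.sqrt(2 * N)              # truncation radius A = sqrt(2N)
draws = [sample_truncated_gaussian(center, 1.0, A, rng) for _ in range(200)]
print(max(math.sqrt(sum(x * x for x in d)) for d in draws) <= A)
```

The expected number of attempts per accepted draw is $1 / P_{A,\sigma}$, so the loop is cheap whenever the acceptance probability is bounded away from zero.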

For our smoothed analysis we will take $A = \sqrt{2N}$. In this case, we have
$P_{A,\sigma} \ge \frac{1}{2}$ for all $\sigma \le 1$ (Lemma [SH). Note also that $\mathrm{Var}(\|f - \overline{f}\|) \le \sigma^2$, so
that any upper bound polynomial in $\sigma^{-1}$ is also an upper bound polynomial
in $\mathrm{Var}(\|f - \overline{f}\|)^{-1}$.

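We can now state our smoothed analysis result for LV.

Theorem 3.5. For any $0 < \sigma \le 1$, Algorithm LV satisfies

    $\sup_{\overline{f} \in S(\mathcal{H}_{\mathbf{d}})} \mathop{\mathbb{E}}_{f \sim N_A(\overline{f}, \sigma^2 I)} K(f) \le 3707\, D^{3/2} \left( N + 2^{-1/2} \sqrt{N} \right) (n + 1)\, \frac{1}{\sigma}.$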

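3.5. The Main Technical Result. The technical heart of the proof of
the mentioned results on the average and smoothed analysis of LV is the
following smoothed analysis of the mean square condition number.

Theorem 3.6. Let $\overline{q} \in \mathcal{H}_{\mathbf{d}}$ and $\sigma > 0$. For $q \in \mathcal{H}_{\mathbf{d}}$ drawn from $N(\overline{q}, \sigma^2 I)$
we have

    $\mathop{\mathbb{E}} \left( \frac{\mu_2^2(q)}{\|q\|^2} \right) \le \frac{e(n + 1)}{2\sigma^2}.$

We note that no bound on the norm of $\overline{q}$ is required here. Indeed, using
$\mu_2(\lambda q) = \mu_2(q)$, it is easy to see that the assertion for $\overline{q}, \sigma$ implies the
assertion for $\lambda\overline{q}, \lambda\sigma$, for any $\lambda > 0$.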
3.6. Condition-based Analysis of LV. We are here interested in
estimating $K(f)$ for a fixed input system $f \in S(\mathcal{H}_{\mathbf{d}})$. Such an estimate will have to
depend on, besides $N$, $n$, and $D$, the condition of $f$. We measure the latter
using Shub and Smale's [22] $\mu_{\max}(f)$ defined in (1.1). Our condition-based
analysis of LV is summarized in the following statement.

Theorem 3.7. The expected number of iterations of Algorithm LV with input
$f \in S(\mathcal{H}_{\mathbf{d}}) \setminus \Sigma$ is bounded as

    $K(f) \le 157109\, D^3 N (n + 1)\, \mu_{\max}^2(f).$

3.7. A Near Solution of Smale's 17th Problem. We finally want to
consider deterministic algorithms finding zeros of polynomial systems. Our
goal is to exhibit one such algorithm working in nearly-polynomial average
time, more precisely in average time $N^{O(\log\log N)}$. A first ingredient to do
so is a deterministic homotopy algorithm which is fast when $D$ is small.
This consists of algorithm ALH plus the initial pair $(\overline{U}, z_1)$, where $\overline{U} =
(\overline{U}_1, \ldots, \overline{U}_n) \in S(\mathcal{H}_{\mathbf{d}})$ with $\overline{U}_i = \frac{1}{\sqrt{2n}} \left( X_0^{d_i} - X_i^{d_i} \right)$ and $z_1 = (1 : 1 : \ldots : 1)$.
We consider the following algorithm MD (Moderate Degree):

Algorithm MD

input $f \in \mathcal{H}_{\mathbf{d}}$

run ALH on input $(f, \overline{U}, z_1)$

We write $K_{\overline{U}}(f) := K(f, \overline{U}, z_1)$ for the number of iterations of algorithm
MD with input $f$. We are interested in computing the average over $f$ of
$K_{\overline{U}}(f)$ for $f$ randomly chosen in $S(\mathcal{H}_{\mathbf{d}})$ from the uniform distribution.

The complexity of MD is bounded as follows. 

Theorem 3.8. The average number of iterations of Algorithm MD is bounded 
as 

E Ktt(/) < 314217 L»^A(n + l)^+^ 
/eS{Wd) 

Algorithm MD is efficient when $D$ is small, say, when $D \le n$. For $D > n$
we use another approach, namely, a real number algorithm designed by Jim
Renegar [18] which in this case has a performance similar to that of MD
when $D \le n$. Putting both pieces together we will reach our last main
result.

Theorem 3.9. There is a deterministic real number algorithm that on input
$f \in \mathcal{H}_d$ computes an approximate zero of $f$ in average time $N^{O(\log\log N)}$,
where $N = \dim \mathcal{H}_d$ measures the size of the input $f$. Moreover, if we restrict
data to polynomials satisfying

$$D \le n^{\frac{1}{1+\varepsilon}} \quad\text{or}\quad D \ge n^{1+\varepsilon},$$

for some fixed $\varepsilon > 0$, then the average time of the algorithm is polynomial
in the input size $N$.
4. Complexity Analysis of ALH

The goal of this section is to prove Theorem 3.1. An essential component
in this proof is an estimate of how much $\mu_{\mathrm{norm}}(f, \zeta)$ changes when $f$ or $\zeta$
(or both) are slightly perturbed. The following result gives upper and lower
bounds on this variation. It is a precise version, with explicit constants, of
Theorem 1 of [21].

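Proposition 4.1. Assume $D \ge 2$. Let $0 < \varepsilon \le 0.13$ be arbitrary and
$C \le \frac{\varepsilon}{5.2}$. For all $f, g \in S(\mathcal{H}_d)$ and all $x, \zeta \in \mathbb{P}^n$, if
$d_S(f,g) \le \frac{C}{D^{1/2}\mu_{\mathrm{norm}}(f,\zeta)}$ and $d_{\mathbb{P}}(\zeta, x) \le \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(f,\zeta)}$, then

$$\frac{1}{1+\varepsilon}\,\mu_{\mathrm{norm}}(g, x) \le \mu_{\mathrm{norm}}(f, \zeta) \le (1+\varepsilon)\,\mu_{\mathrm{norm}}(g, x).$$

In what follows, we will fix the constants as $\varepsilon = 0.13$ and $C = \frac{\varepsilon}{5.2} = 0.025$.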

Remark 4.2. The constants $C$ and $\varepsilon$ implicitly occur in the statement of
Theorem 3.1 since the 217 therein is a function of these numbers. But their
role is not limited to this since they also occur in the algorithm ALH in the
parameter A = ^[^^'^3 controlling the update r + At of r. We note that for 
the former we could do without precise values by using the big Oh notation. 
In contrast, we cannot talk of a constructive procedure unless all of its steps 
are precisely given. 

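Proof of Theorem 3.1. Let $0 = \tau_0 < \tau_1 < \ldots < \tau_k = 1$ and $\zeta_0 = x_0, x_1, \ldots, x_k$
be the sequences of $\tau$-values and points in $\mathbb{P}^n$ generated by the algorithm
ALH. To simplify notation we write $q_i$ instead of $q_{\tau_i}$, and $\zeta_i$ instead of $\zeta_{\tau_i}$.

We claim that, for $i = 0, \ldots, k-1$, the following inequalities are true:

(a) $d_{\mathbb{P}}(x_i, \zeta_i) \le \dfrac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i, \zeta_i)}$

(b) $\dfrac{\mu_{\mathrm{norm}}(q_i, x_i)}{1+\varepsilon} \le \mu_{\mathrm{norm}}(q_i, \zeta_i) \le (1+\varepsilon)\,\mu_{\mathrm{norm}}(q_i, x_i)$

(c) $d_S(q_i, q_{i+1}) \le \dfrac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i, \zeta_i)}$

(d) $d_{\mathbb{P}}(\zeta_i, \zeta_{i+1}) \le \dfrac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i, \zeta_i)} \cdot \dfrac{1-\varepsilon}{1+\varepsilon}$

(e) $d_{\mathbb{P}}(x_i, \zeta_{i+1}) \le \dfrac{2C}{(1+\varepsilon)\,D^{3/2}\mu_{\mathrm{norm}}(q_i, \zeta_i)}$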

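We proceed by induction showing that

(a, i) $\Rightarrow$ (b, i) $\Rightarrow$ ((c, i) and (d, i)) $\Rightarrow$ (e, i) $\Rightarrow$ (a, i + 1).

Inequality (a) for $i = 0$ is trivial.

Assume now that (a) holds for some $i \le k - 1$. Then, Proposition 4.1
(with $f = g = q_i$) implies

$$\frac{\mu_{\mathrm{norm}}(q_i, x_i)}{1+\varepsilon} \le \mu_{\mathrm{norm}}(q_i, \zeta_i) \le (1+\varepsilon)\,\mu_{\mathrm{norm}}(q_i, x_i)$$

and thus (b). We now show (c) and (d). To do so, let $\tau_* > \tau_i$ be such that
$\int_{\tau_i}^{\tau_*} (\|\dot q_\tau\| + \|\dot\zeta_\tau\|)\, d\tau = \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)}\cdot\frac{1-\varepsilon}{1+\varepsilon}$ or $\tau_* = 1$, whichever the smallest.
Then, for all $t \in [\tau_i, \tau_*]$,

$$d_{\mathbb{P}}(\zeta_i, \zeta_t) \le \int_{\tau_i}^{t} \|\dot\zeta_\tau\|\, d\tau \le \int_{\tau_i}^{\tau_*} (\|\dot q_\tau\| + \|\dot\zeta_\tau\|)\, d\tau \le \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)}\cdot\frac{1-\varepsilon}{1+\varepsilon}$$

and, similarly,

$$d_S(q_i, q_t) \le \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)}\cdot\frac{1-\varepsilon}{1+\varepsilon} \le \frac{C}{D^{1/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)}.$$

It is therefore enough to show that $\tau_{i+1} \le \tau_*$. This is trivial if $\tau_* = 1$.
We therefore assume $\tau_* < 1$. The two bounds above allow us to apply
Proposition 4.1 and to deduce, for all $\tau \in [\tau_i, \tau_*]$,

$$\frac{\mu_{\mathrm{norm}}(q_i, \zeta_i)}{1+\varepsilon} \le \mu_{\mathrm{norm}}(q_\tau, \zeta_\tau) \le (1+\varepsilon)\,\mu_{\mathrm{norm}}(q_i, \zeta_i).$$

From $\|\dot\zeta_\tau\| \le \mu_{\mathrm{norm}}(q_\tau, \zeta_\tau)\,\|\dot q_\tau\|$ (cf. [8, §12.3-12.4]) and $\mu_{\mathrm{norm}}(q_\tau, \zeta_\tau) \ge 1$ it
follows that

$$\frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)}\cdot\frac{1-\varepsilon}{1+\varepsilon} = \int_{\tau_i}^{\tau_*} (\|\dot q_\tau\| + \|\dot\zeta_\tau\|)\, d\tau \le \int_{\tau_i}^{\tau_*} 2\,\mu_{\mathrm{norm}}(q_\tau,\zeta_\tau)\,\|\dot q_\tau\|\, d\tau$$
$$\le 2(1+\varepsilon)\,\mu_{\mathrm{norm}}(q_i,\zeta_i) \int_{\tau_i}^{\tau_*} \|\dot q_\tau\|\, d\tau \le 2\, d_S(q_i, q_{\tau_*})\,(1+\varepsilon)\,\mu_{\mathrm{norm}}(q_i, \zeta_i).$$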

Consequently, using (b), we obtain 

C(l-e) C(l-e) 



The parameter A in ALH is chosen as 2(i+ey'' slightly less). By the 



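definition of $\tau_{i+1} - \tau_i$ in ALH we have $\alpha(\tau_{i+1} - \tau_i) = \frac{\lambda}{D^{3/2}\mu_{\mathrm{norm}}^2(q_i, x_i)}$. So we
obtain

$$d_S(q_i, q_{\tau_*}) \ge \alpha(\tau_{i+1} - \tau_i) = d_S(q_i, q_{i+1}).$$

This implies $\tau_{i+1} \le \tau_*$ as claimed and hence, inequalities (c) and (d). With
them, we may apply Proposition 4.1 to deduce, for all $\tau \in [\tau_i, \tau_{i+1}]$,

$$\frac{\mu_{\mathrm{norm}}(q_i, \zeta_i)}{1+\varepsilon} \le \mu_{\mathrm{norm}}(q_\tau, \zeta_\tau) \le (1+\varepsilon)\,\mu_{\mathrm{norm}}(q_i, \zeta_i). \tag{4.1}$$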
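Next we use the triangle inequality, (a), and (d), to obtain

$$d_{\mathbb{P}}(x_i, \zeta_{i+1}) \le d_{\mathbb{P}}(x_i, \zeta_i) + d_{\mathbb{P}}(\zeta_i, \zeta_{i+1}) \le \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)} + \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)}\cdot\frac{1-\varepsilon}{1+\varepsilon} = \frac{2C}{(1+\varepsilon)\,D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)},$$

which proves (e). Theorem 2.2 yields that $x_i$ is an approximate zero of $q_{i+1}$
associated with its zero $\zeta_{i+1}$. Indeed, by our choice of $C$ and $\varepsilon$, we have
$2C \le u_0(1+\varepsilon)$ and hence $d_{\mathbb{P}}(x_i, \zeta_{i+1}) \le \frac{u_0}{D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)}$. Therefore, $x_{i+1} = N_{q_{i+1}}(x_i)$ satisfies

$$d_{\mathbb{P}}(x_{i+1}, \zeta_{i+1}) \le \tfrac{1}{2}\, d_{\mathbb{P}}(x_i, \zeta_{i+1}).$$

Using (e) and the right-hand inequality in (4.1) with $\tau = \tau_{i+1}$, we obtain

$$d_{\mathbb{P}}(x_{i+1}, \zeta_{i+1}) \le \frac{C}{(1+\varepsilon)\,D^{3/2}\mu_{\mathrm{norm}}(q_i,\zeta_i)} \le \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_{i+1},\zeta_{i+1})},$$

which proves (a) for $i + 1$. The claim is thus proved.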

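The estimate $d_{\mathbb{P}}(x_k, \zeta_k) \le \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(q_k,\zeta_k)}$ just shown for $i = k - 1$ implies
by Theorem 2.2 that the returned point $x_k$ is an approximate zero of $q_k = f$
with associated zero $\zeta_k$.

Consider now any $i \in \{0, \ldots, k-1\}$. Using (4.1) and (b) we obtain

$$\int_{\tau_i}^{\tau_{i+1}} \mu_{\mathrm{norm}}^2(q_\tau, \zeta_\tau)\, d\tau \ge \int_{\tau_i}^{\tau_{i+1}} \frac{\mu_{\mathrm{norm}}^2(q_i,\zeta_i)}{(1+\varepsilon)^2}\, d\tau = \frac{\mu_{\mathrm{norm}}^2(q_i,\zeta_i)}{(1+\varepsilon)^2}\,(\tau_{i+1} - \tau_i)$$
$$= \frac{\mu_{\mathrm{norm}}^2(q_i,\zeta_i)}{(1+\varepsilon)^2}\cdot\frac{\lambda}{\alpha D^{3/2}\mu_{\mathrm{norm}}^2(q_i, x_i)} \ge \frac{\lambda}{(1+\varepsilon)^4\,\alpha D^{3/2}} \ge \frac{1}{217\,\alpha D^{3/2}}.$$

This implies

$$\int_0^1 \mu_{\mathrm{norm}}^2(q_\tau, \zeta_\tau)\, d\tau \ge \frac{k}{217\,\alpha D^{3/2}},$$

which proves the stated bound on $k$. □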



5. A Useful Change of Variables 



We first draw a conclusion of Theorem 3.1 that we will need several times.
Recall the definition (1.4) of the mean square condition number $\mu_2(q)$.




Proposition 5.1. The expected number of iterations of ALH on input
$f \in \mathcal{H}_d \setminus \Sigma$ is bounded as

$$K(f) \le 217\, D^{3/2}\, \mathbb{E}_{g \in S(\mathcal{H}_d)} \left( d_S(f,g) \int_0^1 \mu_2^2(q_\tau)\, d\tau \right).$$

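Proof. Fix $g \in S(\mathcal{H}_d)$ such that the segment $E_{f,g}$ does not intersect the
discriminant variety $\Sigma$ (which is the case for almost all $g$, as $f \notin \Sigma$). To each
of the zeros $\zeta^{(i)}$ of $g$ there corresponds a lifting $[0,1] \to V$, $\tau \mapsto (q_\tau, \zeta_\tau^{(i)})$ of
$E_{f,g}$ such that $\zeta_0^{(i)} = \zeta^{(i)}$. Theorem 3.1 states that

$$K(f, g, \zeta^{(i)}) \le 217\, D^{3/2}\, d_S(f,g) \int_0^1 \mu_{\mathrm{norm}}^2(q_\tau, \zeta_\tau^{(i)})\, d\tau.$$

Since $\zeta_\tau^{(1)}, \ldots, \zeta_\tau^{(\mathcal{D})}$ are the zeros of $q_\tau$, we have by the definition (1.4) of the
mean square condition number

$$\frac{1}{\mathcal{D}} \sum_{i=1}^{\mathcal{D}} K(f, g, \zeta^{(i)}) \le 217\, D^{3/2}\, d_S(f,g) \int_0^1 \mu_2^2(q_\tau)\, d\tau. \tag{5.1}$$

The assertion follows now from (compare the forthcoming Lemma 8.8)

$$K(f) = \mathbb{E}_{(g,\zeta)} \big( K(f, g, \zeta) \big) = \mathbb{E}_{g \in S(\mathcal{H}_d)} \left( \frac{1}{\mathcal{D}} \sum_{i=1}^{\mathcal{D}} K(f, g, \zeta^{(i)}) \right).$$

□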



The remainder of this article is devoted to proving Theorems 3.4-3.9. All
of them involve expectations (over random $f$ and/or $g$) of the integral
$\int_0^1 \mu_2^2(q_\tau)\, d\tau$. In all cases, we will eventually deal with such an expectation
with $f$ and $g$ Gaussian. Since a linear combination (with fixed coefficients)
of two such Gaussian systems is Gaussian as well, it is convenient to
parameterize the interval $E_{f,g}$ by a parameter $t \in [0, 1]$ representing a ratio of
Euclidean distances (instead of a ratio of angles as $\tau$ does). Thus we write,
abusing notation, $q_t = tf + (1-t)g$. For fixed $t$, as noted before, $q_t$ follows
a Gaussian law. For this new parametrization we have the following result.

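Proposition 5.2. Let $f, g \in \mathcal{H}_d$ be $\mathbb{R}$-linearly independent and $\tau_0 \in [0,1]$.
Then

$$\alpha \int_{\tau_0}^1 \mu_2^2(q_\tau)\, d\tau \le \|f\|\,\|g\| \int_{t_0}^1 \frac{\mu_2^2(q_t)}{\|q_t\|^2}\, dt,$$

where

$$t_0 = \frac{\|g\|}{\|g\| + \|f\|\,(\sin\alpha \cot(\tau_0\alpha) - \cos\alpha)}$$

is the fraction of the Euclidean distance $\|f - g\|$ corresponding to the fraction
$\tau_0$ of the angle $\alpha = d_S(f,g)$.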

Proof. For $t \in [0, 1]$, abusing notation, we let $q_t = tf + (1-t)g$ and $\tau(t) \in [0, 1]$
be such that $\tau(t)\alpha$ is the angle between $g$ and $q_t$. This defines a bijective
map $[t_0, 1] \to [\tau_0, 1]$, $t \mapsto \tau(t)$. We denote its inverse by $\tau \mapsto t(\tau)$. We claim
that

$$\frac{d\tau}{dt} = \frac{\sin\alpha}{\alpha}\cdot\frac{\|f\|\,\|g\|}{\|q_t\|^2}. \tag{5.2}$$

Note that the stated inequality easily follows from this claim by the
transformation formula for integrals together with the bound $\sin\alpha \le 1$.

To prove Claim (5.2), denote $r = \|f\|$ and $s = \|g\|$. We will explicitly
compute $t(\tau)$ by some elementary geometry. For this, we introduce cartesian
coordinates in the plane spanned by $f$ and $g$ and assume that $g$ has the
coordinates $(s, 0)$ and $f$ has the coordinates $(r\cos\alpha, r\sin\alpha)$, see Figure 1.

Then, the lines determining $q_\tau$ have the equations

$$x = y\,\frac{\cos(\tau\alpha)}{\sin(\tau\alpha)} \quad\text{and}\quad x = y\,\frac{r\cos\alpha - s}{r\sin\alpha} + s,$$

from where it follows that the coordinate $y$ of $q_\tau$ is

$$y = \frac{r s \sin\alpha \sin(\tau\alpha)}{r\sin\alpha\cos(\tau\alpha) - r\cos\alpha\sin(\tau\alpha) + s\sin(\tau\alpha)}. \tag{5.3}$$

Since $t(\tau) = \frac{y}{r\sin\alpha}$ it follows that

$$t(\tau) = \frac{s}{r\sin\alpha\cot(\tau\alpha) - r\cos\alpha + s}. \tag{5.4}$$

This implies the stated formula for $t_0 = t(\tau_0)$. Differentiating with respect
to $\tau$, using (5.3) and $\sin(\tau\alpha) = \frac{y}{\|q_{t(\tau)}\|}$, we obtain from (5.4)

$$\frac{dt}{d\tau} = \frac{\alpha\, r s \sin\alpha}{(r\sin\alpha\cos(\tau\alpha) - r\cos\alpha\sin(\tau\alpha) + s\sin(\tau\alpha))^2} = \frac{\alpha\, y^2}{r s \sin^2(\tau\alpha)\sin\alpha} = \frac{\alpha\,\|q_{t(\tau)}\|^2}{r s \sin\alpha}.$$

This finishes the proof of Claim (5.2). □
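The change of variables can be checked numerically. The sketch below is illustrative only: it works in the plane spanned by $f$ and $g$ with the coordinates used in the proof, the function names are hypothetical, and the sample values of $r$, $s$, $\alpha$ are arbitrary. It compares formula (5.4) with the angle computed directly, and Claim (5.2) with a central finite difference.

```python
import math

def q_coords(t, r, s, alpha):
    """Coordinates of q_t = t f + (1 - t) g with g = (s, 0), f = r (cos a, sin a)."""
    return (t * r * math.cos(alpha) + (1 - t) * s, t * r * math.sin(alpha))

def tau_of_t(t, r, s, alpha):
    """Fraction tau of the angle alpha such that tau * alpha is the angle between g and q_t."""
    x, y = q_coords(t, r, s, alpha)
    return math.atan2(y, x) / alpha

def t_of_tau(tau, r, s, alpha):
    """Formula (5.4)."""
    cot = math.cos(tau * alpha) / math.sin(tau * alpha)
    return s / (r * math.sin(alpha) * cot - r * math.cos(alpha) + s)

def dtau_dt_claim(t, r, s, alpha):
    """Right-hand side of Claim (5.2)."""
    x, y = q_coords(t, r, s, alpha)
    return math.sin(alpha) * r * s / (alpha * (x * x + y * y))

def dtau_dt_numeric(t, r, s, alpha, h=1e-6):
    """Central finite difference of tau(t)."""
    return (tau_of_t(t + h, r, s, alpha) - tau_of_t(t - h, r, s, alpha)) / (2 * h)
```

For example, with `r, s, alpha = 2.0, 0.7, 1.1` the two derivative functions agree to within the finite-difference error for every `t` strictly between 0 and 1.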
In all the cases we will deal with, the factor $\|f\|\,\|g\|$ will be easily bounded
and factored out of the expectation. We will ultimately face the problem of
estimating expectations of $\frac{\mu_2^2(q_t)}{\|q_t\|^2}$ for different choices of $\bar q_t$ and $\sigma_t$. This is
achieved by Theorem 3.6 stated in §3.5.



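6. Analysis of LV

We derive here from Theorem 3.6 our main results on the average and
smoothed analysis of LV stated in §3. The proof of Theorem 3.6 is postponed
to Sections 7-8.

6.1. Average-case Analysis of LV (proof). To warm up, we first prove
Theorem 3.4, which illustrates the blending of the previous results in a
simpler setting.

In the following we set $A := \sqrt{2N}$ and write $P_{A,\sigma} := \mathrm{Prob}\{\|f\| \le A \mid f \sim N(0, \sigma^2 I)\}$
for $\sigma > 0$.

Lemma 6.1. We have $P_{A,\sigma} \ge \frac{1}{2}$ for all $0 < \sigma \le 1$.

Proof. Clearly it suffices to assume $\sigma = 1$. The random variable $\|f\|^2$ is
chi-square distributed with $2N$ degrees of freedom. Its mean equals $2N$.
In [13, Corollary 6] it is shown that the median of a chi-square distribution
is always less than its mean. □

Proof of Theorem 3.4. We use Proposition 5.1 to obtain

$$\mathbb{E}_{f \in S(\mathcal{H}_d)} K(f) \le 217\, D^{3/2}\, \mathbb{E}_{f \in S(\mathcal{H}_d)}\, \mathbb{E}_{g \in S(\mathcal{H}_d)} \left( d_S(f,g) \int_0^1 \mu_2^2(q_\tau)\, d\tau \right) = 217\, D^{3/2}\, \mathbb{E}_{f \sim N_A(0,I)}\, \mathbb{E}_{g \sim N_A(0,I)} \left( d_S(f,g) \int_0^1 \mu_2^2(q_\tau)\, d\tau \right).$$

The equality follows from the fact that, since both $d_S(f,g)$ and $\mu_2^2(q_\tau)$ are
homogeneous of degree 0 in both $f$ and $g$, we may replace the uniform
distribution on $S(\mathcal{H}_d)$ by any rotationally invariant distribution on $\mathcal{H}_d$, in
particular by the centered truncated Gaussian $N_A(0, I)$ defined in (3.1). Now
we use Proposition 5.2 (with $\tau_0 = 0$) to get

$$\mathbb{E}_{f \in S(\mathcal{H}_d)} K(f) \le 217\, D^{3/2}\, \frac{A^2}{P_{A,1}^2}\, \mathbb{E}_{f \sim N(0,I)}\, \mathbb{E}_{g \sim N(0,I)} \left( \int_0^1 \frac{\mu_2^2(q_t)}{\|q_t\|^2}\, dt \right). \tag{6.1}$$

Denoting by $\rho_{0,1}$ the density of $N(0, I)$, the right-hand side of (6.1) equals

$$217\, D^{3/2}\, \frac{A^2}{P_{A,1}^2} \int_f \int_g \left( \int_0^1 \frac{\mu_2^2(q_t)}{\|q_t\|^2}\, dt \right) \rho_{0,1}(g)\, \rho_{0,1}(f)\, dg\, df = 217\, D^{3/2}\, \frac{A^2}{P_{A,1}^2} \int_0^1 \mathbb{E}_{q_t \sim N(0, \sigma_t^2 I)} \left( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \right) dt,$$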
where the last equality follows from the fact that, for fixed $t$, the random
polynomial system $q_t = tf + (1-t)g$ has a Gaussian distribution with law
$N(0, \sigma_t^2 I)$, where $\sigma_t^2 := t^2 + (1-t)^2$. Note that we deal with nonnegative
integrands, so the interchange of integrals is justified by Tonelli's theorem.
By Lemma 6.1 we have $\frac{A^2}{P_{A,1}^2} \le 8N$.
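Lemma 6.1 and the resulting factor $8N$ can be checked numerically. The following sketch is an illustration, not part of the proof: it uses the closed form of the chi-square distribution function for an even number $2N$ of degrees of freedom, $\mathrm{Prob}\{\chi^2_{2N} \le x\} = 1 - e^{-x/2}\sum_{k<N} (x/2)^k/k!$, evaluated at $x = 2N$. The function name is hypothetical.

```python
import math

def prob_norm_at_most_A(N):
    """P_{A,1} = Prob{chi^2_{2N} <= 2N} for A = sqrt(2N), via the Erlang closed form."""
    # Each term exp(-N) * N^k / k! is computed in log space to avoid overflow.
    tail = sum(math.exp(-N + k * math.log(N) - math.lgamma(k + 1)) for k in range(N))
    return 1.0 - tail

# A^2 / P_{A,1}^2 with A^2 = 2N, tabulated for a few input sizes N.
ratios = {N: 2 * N / prob_norm_at_most_A(N) ** 2 for N in (1, 2, 10, 100)}
```

The probability decreases towards $1/2$ as $N$ grows, so the bound $A^2/P_{A,1}^2 \le 8N$ is asymptotically sharp in the constant.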

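We now apply Theorem 3.6 to deduce that

$$\int_0^1 \mathbb{E}_{q_t \sim N(0, \sigma_t^2 I)} \left( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \right) dt \le \frac{e(n+1)}{2} \int_0^1 \frac{dt}{t^2 + (1-t)^2} = \frac{e\pi(n+1)}{4}.$$

Consequently,

$$\mathbb{E}_{f \in S(\mathcal{H}_d)} K(f) \le 217\, D^{3/2} \cdot 8N \cdot \frac{e\pi(n+1)}{4} \le 3707\, D^{3/2} N (n+1). \qquad □$$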
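The two numerical facts used in the last step, namely $\int_0^1 \frac{dt}{t^2 + (1-t)^2} = \frac{\pi}{2}$ and $217 \cdot 8 \cdot \frac{e\pi}{4} \le 3707$, can be confirmed with a short computation. This is a sketch for verification only; the midpoint rule and the helper name are choices made here, not taken from the paper.

```python
import math

def midpoint_integral(f, a, b, steps=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Integral of 1 / sigma_t^2 with sigma_t^2 = t^2 + (1 - t)^2.
avg_integral = midpoint_integral(lambda t: 1.0 / (t * t + (1 - t) ** 2), 0.0, 1.0)

# Constant in front of D^{3/2} N (n + 1).
avg_constant = 217 * 8 * math.e * math.pi / 4
```

The constant evaluates to a value slightly above 3706, which is rounded up to 3707 in the statement.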
/e5(Wd) 4 

Remark 6.2. The proof (modulo the existence of ALH) for the average
complexity of LV given by Beltrán and Pardo in [6] differs from the one above.
It relies on the fact (elegantly shown by using integral geometry arguments)
that, for all $\tau \in [0, 1]$, when $f$ and $g$ are uniformly drawn from the sphere,
so is $q_\tau/\|q_\tau\|$. The extension of this argument to more general situations
appears to be considerably more involved. In contrast, as we shall shortly see,
the argument based on Gaussians in the proof above carries over, mutatis
mutandis, to the smoothed analysis context.

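6.2. Smoothed Analysis of LV (proof). The smoothed analysis of LV is
shown similarly to its average-case analysis.

Proof of Theorem 3.5. Fix $\bar f \in S(\mathcal{H}_d)$. Reasoning as in the proof of
Theorem 3.4 and using $\|f\| \le \|\bar f\| + \|f - \bar f\| \le 1 + A$, we show that

$$\mathbb{E}_{f \sim N_A(\bar f, \sigma^2 I)} K(f) \le 217\, D^{3/2}\, \frac{(A+1)A}{P_{A,\sigma} P_{A,1}}\, \mathbb{E}_{f \sim N(\bar f, \sigma^2 I)}\, \mathbb{E}_{g \sim N(0, I)} \left( \int_0^1 \frac{\mu_2^2(q_t)}{\|q_t\|^2}\, dt \right)$$
$$= 217\, D^{3/2}\, \frac{(A+1)A}{P_{A,\sigma} P_{A,1}} \int_0^1 \mathbb{E}_{q_t \sim N(\bar q_t, \sigma_t^2 I)} \left( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \right) dt$$

with $\bar q_t = t\bar f$ and $\sigma_t^2 = (1-t)^2 + \sigma^2 t^2$. We now apply Theorem 3.6 to deduce

$$\int_0^1 \mathbb{E}_{q_t \sim N(\bar q_t, \sigma_t^2 I)} \left( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \right) dt \le \frac{e(n+1)}{2} \int_0^1 \frac{dt}{(1-t)^2 + \sigma^2 t^2} = \frac{e\pi(n+1)}{4\sigma}.$$

Consequently, using Lemma 6.1, we get

$$\mathbb{E}_{f \sim N_A(\bar f, \sigma^2 I)} K(f) \le 217\, D^{3/2} \cdot 4 \cdot \big(2N + \sqrt{2N}\big) \cdot \frac{e\pi(n+1)}{4\sigma},$$

which proves the assertion. □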
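The integral $\int_0^1 \frac{dt}{(1-t)^2 + \sigma^2 t^2} = \frac{\pi}{2\sigma}$ used above can be verified numerically for several values of $\sigma$. The sketch below is only a check; the midpoint rule, the helper names, and the sample values of $\sigma$ are assumptions made for illustration.

```python
import math

def midpoint_integral(f, a, b, steps=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def smoothed_integral(sigma):
    """Numerical value of the integral of 1 / ((1 - t)^2 + sigma^2 t^2) over [0, 1]."""
    return midpoint_integral(lambda t: 1.0 / ((1 - t) ** 2 + (sigma * t) ** 2), 0.0, 1.0)

checks = {sigma: (smoothed_integral(sigma), math.pi / (2 * sigma))
          for sigma in (1.0, 0.5, 0.1)}
```

For $\sigma = 1$ the value differs from the average-case integral only through the roles of $t$ and $1 - t$ and the factor $\sigma$, which is why both proofs end with the same constant $e\pi/4$ up to the factor $1/\sigma$.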

The next two sections are devoted to the proof of Theorem 3.6. First,
in Section 7 we give a particular smoothed analysis of a matrix condition
number (Proposition 7.1). Then, in Section 8 we reduce Theorem 3.6 to
this smoothed analysis of matrix condition numbers.
7. Smoothed Analysis of a Matrix Condition Number

In the following we fix $\bar A \in \mathbb{C}^{n \times n}$, $\sigma > 0$ and denote by $\rho$ the Gaussian
density of $N(\bar A, \sigma^2 I)$ on $\mathbb{C}^{n \times n}$. Moreover, we consider the related density

$$\tilde\rho(A) = c^{-1}\, |\det A|^2\, \rho(A) \quad\text{where}\quad c := \mathbb{E}_{A \sim \rho}\big(|\det A|^2\big). \tag{7.1}$$

The following result is akin to a smoothed analysis of the matrix condition
number $\kappa(A) = \|A\| \cdot \|A^{-1}\|$ with respect to the probability densities $\tilde\rho$ that
are not Gaussian, but closely related to Gaussians.

Proposition 7.1. We have $\mathbb{E}_{A \sim \tilde\rho}\big(\|A^{-1}\|^2\big) \le \frac{e(n+1)}{2\sigma^2}$.

The proof is based on ideas in Sankar et al. [19, §3], see also [10]. We will
actually prove tail bounds from which the stated bound on the expectation
easily follows.

We denote by $S^{n-1} := \{\zeta \in \mathbb{C}^n \mid \|\zeta\| = 1\}$ the unit sphere in $\mathbb{C}^n$.

Lemma 7.2. For any $v \in S^{n-1}$ and any $t > 0$ we have

$$\mathrm{Prob}_{A \sim \tilde\rho}\big\{\|A^{-1}v\| \ge t\big\} \le \frac{1}{4\sigma^4 t^4}.$$

Proof. We first claim that, because of unitary invariance, we may assume
that $v = e_n := (0, \ldots, 0, 1)$. To see this, take $S \in U(n)$ such that $v = S e_n$.
Consider the isometric map $A \mapsto B = S^{-1}A$ which transforms the density
$\tilde\rho(A)$ to a density of the same form, namely

$$\tilde\rho'(B) = \tilde\rho(A) = c^{-1}\, |\det A|^2\, \rho(A) = c^{-1}\, |\det B|^2\, \rho'(B),$$

where $\rho'(B)$ denotes the density of $N(S^{-1}\bar A, \sigma^2 I)$ and $c = \mathbb{E}_\rho(|\det A|^2) = \mathbb{E}_{\rho'}(|\det B|^2)$.
Thus the assertion for $e_n$ and random $B$ (chosen from any
isotropic Gaussian distribution) implies the assertion for $v$ and $A$, noting
that $A^{-1}v = B^{-1}e_n$. This proves the claim.

Let $a_i$ denote the $i$th row of $A$. Almost surely, the rows $a_1, \ldots, a_{n-1}$ are
linearly independent. We are going to characterize $\|A^{-1}e_n\|$ in a geometric
way. Let $S_n := \mathrm{span}\{a_1, \ldots, a_{n-1}\}$ and denote by $a_n^\perp$ the orthogonal
projection of $a_n$ onto $S_n^\perp$. Consider $w := A^{-1}e_n$, which is the $n$th column of $A^{-1}$.
Since $AA^{-1} = I$ we have $\langle w, a_i \rangle = 0$ for $i = 1, \ldots, n-1$ and hence $w \in S_n^\perp$.
Moreover, $\langle w, a_n \rangle = 1$, so $\|w\|\,\|a_n^\perp\| = 1$ and we arrive at

$$\|A^{-1}e_n\| = \frac{1}{\|a_n^\perp\|}. \tag{7.2}$$

Let $A_n \in \mathbb{C}^{(n-1) \times n}$ denote the matrix obtained from $A$ by omitting $a_n$.
We shall write $\mathrm{vol}(A_n) = \det(A_n A_n^*)^{1/2}$ for the $(n-1)$-dimensional volume
of the parallelepiped spanned by the rows of $A_n$. Similarly, $|\det A|$ can be
interpreted as the $n$-dimensional volume of the parallelepiped spanned by
the rows of $A$.
Now we write $\rho(A) = \rho_1(A_n)\, \rho_2(a_n)$ where $\rho_1$ and $\rho_2$ are the density
functions of $N(\bar A_n, \sigma^2 I)$ and $N(\bar a_n, \sigma^2 I)$, respectively (the meaning of $\bar A_n$
and $\bar a_n$ being clear). Moreover, note that

$$\mathrm{vol}(A)^2 = \mathrm{vol}(A_n)^2\, \|a_n^\perp\|^2.$$

Fubini's Theorem combined with (7.2) yields for $t > 0$

$$\int_{\|A^{-1}e_n\| \ge t} \mathrm{vol}(A)^2 \rho(A)\, dA = \int_{A_n} \mathrm{vol}(A_n)^2\, \rho_1(A_n) \left( \int_{\|a_n^\perp\| \le 1/t} \|a_n^\perp\|^2\, \rho_2(a_n)\, da_n \right) dA_n. \tag{7.3}$$

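We next show that for fixed, linearly independent $a_1, \ldots, a_{n-1}$ and $\lambda > 0$

$$\int_{\|a_n^\perp\| \le \lambda} \|a_n^\perp\|^2\, \rho_2(a_n)\, da_n \le \frac{\lambda^4}{2\sigma^2}. \tag{7.4}$$

For this, note that $a_n^\perp \sim N(\bar a_n^\perp, \sigma^2 I)$ in $S_n^\perp \simeq \mathbb{C}$, where $\bar a_n^\perp$ is the orthogonal
projection of $\bar a_n$ onto $S_n^\perp$. Thus, proving (7.4) amounts to showing

$$\int_{|z| \le \lambda} |z|^2\, \rho_{\bar z}(z)\, dz \le \frac{\lambda^4}{2\sigma^2}$$

for the Gaussian density $\rho_{\bar z}(z) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{1}{2\sigma^2}|z - \bar z|^2}$ of $z \in \mathbb{C}$, where $\bar z \in \mathbb{C}$.
Clearly, it is enough to show that

$$\int_{|z| \le \lambda} \rho_{\bar z}(z)\, dz \le \frac{\lambda^2}{2\sigma^2}.$$

Without loss of generality we may assume that $\bar z = 0$, since the integral in
the left-hand side is maximized at this value of $\bar z$. The substitution $z = \sigma w$
yields $dz = \sigma^2 dw$ ($dz$ denotes the Lebesgue measure on $\mathbb{R}^2$) and we get

$$\int_{|z| \le \lambda} \rho_0(z)\, dz = \int_{|w| \le \lambda/\sigma} \frac{1}{2\pi}\, e^{-\frac{1}{2}|w|^2}\, dw = \int_0^{\lambda/\sigma} \frac{1}{2\pi}\, e^{-\frac{1}{2}r^2}\, 2\pi r\, dr = \Big[-e^{-\frac{1}{2}r^2}\Big]_0^{\lambda/\sigma} = 1 - e^{-\frac{\lambda^2}{2\sigma^2}} \le \frac{\lambda^2}{2\sigma^2},$$

which proves inequality (7.4).
A similar argument shows that

$$2\sigma^2 \le \int |z|^2\, \rho_{\bar z}(z)\, dz = \int \|a_n^\perp\|^2\, \rho_2(a_n)\, da_n. \tag{7.5}$$

Plugging in this inequality into (7.3) (with $t = 0$) we conclude that

$$2\sigma^2\, \mathbb{E}_{\rho_1}\big(\mathrm{vol}(A_n)^2\big) \le \mathbb{E}_{\rho}\big(\mathrm{vol}(A)^2\big). \tag{7.6}$$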
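On the other hand, plugging in (7.4) with $\lambda = \frac{1}{t}$ into (7.3), we obtain

$$\int_{\|A^{-1}e_n\| \ge t} \mathrm{vol}(A)^2 \rho(A)\, dA \le \frac{1}{2\sigma^2 t^4}\, \mathbb{E}_{\rho_1}\big(\mathrm{vol}(A_n)^2\big).$$

Combined with (7.6) this yields

$$\int_{\|A^{-1}e_n\| \ge t} \mathrm{vol}(A)^2 \rho(A)\, dA \le \frac{1}{4\sigma^4 t^4}\, \mathbb{E}_{\rho}\big(\mathrm{vol}(A)^2\big).$$

By the definition of the density $\tilde\rho$, this means that

$$\mathrm{Prob}_{A \sim \tilde\rho}\big\{\|A^{-1}e_n\| \ge t\big\} \le \frac{1}{4\sigma^4 t^4},$$

which was to be shown. □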

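Lemma 7.3. For fixed $u \in S^{n-1}$, $0 \le s \le 1$, and random $v$ uniformly
chosen in $S^{n-1}$ we have

$$\mathrm{Prob}_v\big\{|u^* v| \ge s\big\} = (1 - s^2)^{n-1}.$$

Proof. Recall the Riemannian distance $d_{\mathbb{P}}$ in $\mathbb{P}^{n-1} := \mathbb{P}(\mathbb{C}^n)$ from (2.1).
Accordingly, for $0 \le \theta \le \pi/2$, we have

$$\mathrm{Prob}_v\big\{|u^* v| \ge \cos\theta\big\} = \frac{\mathrm{vol}\,\{x \in \mathbb{P}^{n-1} \mid d_{\mathbb{P}}(x, [u]) \le \theta\}}{\mathrm{vol}\, \mathbb{P}^{n-1}} = (\sin\theta)^{2(n-1)},$$

where the last equality is due to [11, Lemma 2.1]. □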
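Lemma 7.3 lends itself to a Monte Carlo check. The sketch below is illustrative and makes two assumptions: a uniform point on the sphere of $\mathbb{C}^n$ is sampled by normalizing a standard complex Gaussian vector, and by unitary invariance $u$ is taken to be the first standard basis vector, so that $|u^* v| = |v_1|$. The function names, sample size, and seed are arbitrary choices.

```python
import random

def random_unit_vector(n, rng):
    """Uniform random point on the unit sphere of C^n (normalized complex Gaussian)."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    norm = sum(abs(z) ** 2 for z in v) ** 0.5
    return [z / norm for z in v]

def estimate_prob(n, s, samples=20_000, seed=0):
    """Empirical frequency of the event |u* v| >= s with u = e_1."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if abs(random_unit_vector(n, rng)[0]) >= s)
    return hits / samples

empirical = estimate_prob(n=4, s=0.5)
exact = (1 - 0.5 ** 2) ** (4 - 1)
```

With 20,000 samples the standard deviation of the estimate is a few thousandths, so `empirical` and `exact` typically agree to two decimal places.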
Lemma 7.4. For any $t > 0$ we have

$$\mathrm{Prob}_{A \sim \tilde\rho}\big\{\|A^{-1}\| \ge t\big\} \le \frac{e^2 (n+1)^2}{16\sigma^4} \cdot \frac{1}{t^4}.$$

Proof. We use an idea in Sankar et al. [19, §3]. For any invertible $A \in \mathbb{C}^{n \times n}$
there exists $u \in S^{n-1}$ such that $\|A^{-1}u\| = \|A^{-1}\|$. For almost all $A$, the
vector $u$ is uniquely determined up to a scaling factor $\theta$ of modulus 1. We
shall denote by $u_A$ a representative of such $u$.

The following is an easy consequence of the singular value decomposition
of $A^{-1}$: for any $v \in S^{n-1}$ we have

$$\|A^{-1}v\| \ge \|A^{-1}\| \cdot |u_A^* v|. \tag{7.7}$$

We choose now a random pair $(A, v)$ with $A$ following the law $\tilde\rho$ and,
independently, $v \in S^{n-1}$ from the uniform distribution. Lemma 7.2 implies
that

$$\mathrm{Prob}_{A,v}\Big\{\|A^{-1}v\| \ge t\sqrt{2/(n+1)}\Big\} \le \frac{(n+1)^2}{16\sigma^4 t^4}.$$
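On the other hand, we have by (7.7)

$$\mathrm{Prob}_{A,v}\Big\{\|A^{-1}v\| \ge t\sqrt{2/(n+1)}\Big\} \ge \mathrm{Prob}_{A,v}\Big\{\|A^{-1}\| \ge t \text{ and } |u_A^* v| \ge \sqrt{2/(n+1)}\Big\}$$
$$\ge \mathrm{Prob}_{A}\big\{\|A^{-1}\| \ge t\big\} \cdot \mathrm{Prob}_{A,v}\Big\{|u_A^* v| \ge \sqrt{2/(n+1)} \;\Big|\; \|A^{-1}\| \ge t\Big\}.$$

Lemma 7.3 tells us that for any fixed $u \in S^{n-1}$ we have

$$\mathrm{Prob}_v\Big\{|u^* v| \ge \sqrt{2/(n+1)}\Big\} = \big(1 - 2/(n+1)\big)^{n-1} \ge e^{-2},$$

the last inequality as $\big(\frac{n+1}{n-1}\big)^{n-1} = \big(1 + \frac{2}{n-1}\big)^{n-1} \le e^2$. We thus obtain

$$\mathrm{Prob}_{A}\big\{\|A^{-1}\| \ge t\big\} \le e^2\, \mathrm{Prob}_{A,v}\Big\{\|A^{-1}v\| \ge t\sqrt{2/(n+1)}\Big\} \le \frac{e^2 (n+1)^2}{16\sigma^4 t^4},$$

as claimed. □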
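The elementary inequality $(1 - 2/(n+1))^{n-1} \ge e^{-2}$ used in this proof can be confirmed numerically over a range of dimensions. This is a minimal check, with a hypothetical function name and an arbitrary range of $n$.

```python
import math

def cap_probability(n):
    """Value (1 - 2/(n+1))^(n-1) given by Lemma 7.3 for s = sqrt(2/(n+1))."""
    return (1.0 - 2.0 / (n + 1)) ** (n - 1)

# Smallest value over the tested range, to be compared with exp(-2).
smallest = min(cap_probability(n) for n in range(1, 2000))
```

The sequence decreases towards $e^{-2}$, so the constant $e^2$ in Lemma 7.4 cannot be improved by this particular choice of $s$.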
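Proof of Proposition 7.1. By Lemma 7.4 we obtain, for any $T_0 > 0$,

$$\mathbb{E}\big(\|A^{-1}\|^2\big) = \int_0^\infty \mathrm{Prob}\big\{\|A^{-1}\|^2 \ge T\big\}\, dT \le T_0 + \int_{T_0}^\infty \mathrm{Prob}\big\{\|A^{-1}\|^2 \ge T\big\}\, dT \le T_0 + \frac{e^2 (n+1)^2}{16\sigma^4} \cdot \frac{1}{T_0},$$

using $\int_{T_0}^\infty T^{-2}\, dT = T_0^{-1}$. Now choose $T_0 = \frac{e(n+1)}{4\sigma^2}$. □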
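The choice of $T_0$ balances the two terms of the bound $T_0 + \frac{e^2(n+1)^2}{16\sigma^4 T_0}$. The sketch below, with hypothetical function names and arbitrary sample parameters, evaluates the bound at the stated $T_0$ and at nearby values to illustrate that the stated choice yields $\frac{e(n+1)}{2\sigma^2}$ and is optimal.

```python
import math

def tail_bound(T0, n, sigma):
    """Upper bound T0 + e^2 (n+1)^2 / (16 sigma^4 T0) on E(||A^{-1}||^2)."""
    c = (math.e * (n + 1)) ** 2 / (16 * sigma ** 4)
    return T0 + c / T0

def optimal_T0(n, sigma):
    """The choice T0 = e (n+1) / (4 sigma^2) made in the proof."""
    return math.e * (n + 1) / (4 * sigma ** 2)

n, sigma = 5, 0.3
best = tail_bound(optimal_T0(n, sigma), n, sigma)
target = math.e * (n + 1) / (2 * sigma ** 2)
```

Scaling `optimal_T0(n, sigma)` by any factor other than one only increases `tail_bound`, as expected for a function of the form $T_0 + c/T_0$.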



8. Smoothed Analysis of the Mean Square Condition Number

The goal of this section is to accomplish the proof of Theorem 3.6.

8.1. Orthogonal decompositions of $\mathcal{H}_d$. For reasons to become clear
soon we have to distinguish points in $\mathbb{P}^n$ from their representatives $\zeta$ in the
sphere $S^n := \{\zeta \in \mathbb{C}^{n+1} \mid \|\zeta\| = 1\}$.

For $\zeta \in S^n$ we consider the subspace $R_\zeta$ of $\mathcal{H}_d$ consisting of all systems $h$
that vanish at $\zeta$ of higher order:

$$R_\zeta := \{h \in \mathcal{H}_d \mid h(\zeta) = 0,\ Dh(\zeta) = 0\}.$$

We further decompose the orthogonal complement $R_\zeta^\perp$ of $R_\zeta$ in $\mathcal{H}_d$ (defined
with respect to the Bombieri-Weyl Hermitian inner product). Let $L_\zeta$ denote
the subspace of $R_\zeta^\perp$ consisting of the systems vanishing at $\zeta$ and let $C_\zeta$
denote its orthogonal complement in $R_\zeta^\perp$. Then we have an orthogonal
decomposition

$$\mathcal{H}_d = C_\zeta \oplus L_\zeta \oplus R_\zeta \tag{8.1}$$

parameterized by $\zeta \in S^n$.
Lemma 8.1. The space $C_\zeta$ consists of the systems $(c_i \langle X, \zeta \rangle^{d_i})$ with $c_i \in \mathbb{C}$.
The space $L_\zeta$ consists of the systems

$$g = \big(\sqrt{d_i}\, \langle X, \zeta \rangle^{d_i - 1}\, \ell_i\big)$$

where $\ell_i$ is a linear form vanishing at $\zeta$. Moreover, if $\ell_i = \sum_{j=0}^n m_{ij} X_j$ with
$M = (m_{ij})$, then $\|g\| = \|M\|_F$.

Proof. By unitary invariance it suffices to verify the assertions in the case
$\zeta = (1, 0, \ldots, 0)$. In this case this follows easily from the definition of the
Bombieri-Weyl inner product. □

The Bombieri-Weyl inner product on $\mathcal{H}_d$ and the standard metric on the
sphere $S^n$ define a Riemannian metric on $\mathcal{H}_d \times S^n$ on which the unitary
group $U(n+1)$ operates isometrically. The "lifting"

$$V := \{(q, \zeta) \in \mathcal{H}_d \times S^n \mid q(\zeta) = 0\}$$

of the solution variety $V_{\mathbb{P}}$ is easily seen to be a $U(n+1)$-invariant Riemannian
submanifold of $\mathcal{H}_d \times S^n$.

The projection $\pi_2 \colon V \to S^n$, $(q, \zeta) \mapsto \zeta$ defines a vector bundle with fibers
$V_\zeta := \pi_2^{-1}(\zeta)$. In fact, (8.1) can be interpreted as an orthogonal decomposition
of the trivial Hermitian vector bundle $\mathcal{H}_d \times S^n \to S^n$ into subbundles
$C$, $L$, and $R$ over $S^n$. Moreover, the vector bundle $V$ is the orthogonal sum
of $L$ and $R$: we have $V_\zeta = L_\zeta \oplus R_\zeta$ for all $\zeta$.

In the special case where all the degrees $d_i$ are one, $\mathcal{H}_d$ can be identified
with the space $\mathscr{M} := \mathbb{C}^{n \times (n+1)}$ of matrices and the solution manifold $V$
specializes to the manifold

$$W := \{(M, \zeta) \in \mathscr{M} \times S^n \mid M\zeta = 0\}.$$

The map $\pi_2$ specializes to the vector bundle $p_2 \colon W \to S^n$, $(M, \zeta) \mapsto \zeta$ with
the fibers

$$W_\zeta := \{M \in \mathscr{M} \mid M\zeta = 0\}.$$

Lemma 8.1 tells us that for each $\zeta$ we have isometrical linear maps

$$W_\zeta \to L_\zeta, \quad M \mapsto g_{M,\zeta} := \Big(\sqrt{d_i}\, \langle X, \zeta \rangle^{d_i - 1} \sum_j m_{ij} X_j\Big). \tag{8.2}$$

In other words, the Hermitian vector bundles $W$ and $L$ over $S^n$ are isometric.
The fact that the map (8.2) depends on the choice of the representative of $\zeta$
forces us to work over $S^n$ instead of over $\mathbb{P}^n$. (All other notions introduced so
far only depend on the base point in $\mathbb{P}^n$.)

We compose the orthogonal bundle projection $V_\zeta = L_\zeta \oplus R_\zeta \to L_\zeta$ with
the bundle isometry $L_\zeta \simeq W_\zeta$ obtaining the map of vector bundles

$$\Psi \colon V \to W, \quad (g_{M,\zeta} + h, \zeta) \mapsto (M, \zeta) \tag{8.3}$$

with fibers $\Psi^{-1}(M, \zeta)$ isometric to $R_\zeta$.

Lemma 8.2. We have $\Psi(q, \zeta) = (\Delta^{-1} Dq(\zeta), \zeta)$ where $\Delta := \mathrm{diag}(\sqrt{d_i})$.
Proof. Let $(q, \zeta) \in V$ and $(M, \zeta) := \Psi(q, \zeta)$. Then we have the decomposition
$q = 0 + g_{M,\zeta} + h \in C_\zeta \oplus L_\zeta \oplus R_\zeta$. It is easily checked that
$Dg_{M,\zeta}(\zeta) = \Delta M$. Since $Dq(\zeta) = Dg_{M,\zeta}(\zeta)$ we obtain $M = \Delta^{-1} Dq(\zeta)$. □

The lemma shows that the condition number $\mu_{\mathrm{norm}}(q, \zeta)$ (cf. §2.3) can be
described in terms of $\Psi$ as follows:

$$\frac{\mu_{\mathrm{norm}}(q, \zeta)}{\|q\|} = \|M^\dagger\|, \quad\text{where } (M, \zeta) = \Psi(q, \zeta). \tag{8.4}$$

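8.2. Outline of proof of Theorem 3.6. Let $\rho_{\mathcal{H}_d}$ denote the density of
the Gaussian $N(\bar q, \sigma^2 I)$ on $\mathcal{H}_d$, where $\bar q \in \mathcal{H}_d$ and $\sigma > 0$. For fixed $\zeta \in S^n$
we decompose the mean $\bar q$ as

$$\bar q = \bar k_\zeta + \bar g_\zeta + \bar h_\zeta \in C_\zeta \oplus L_\zeta \oplus R_\zeta$$

according to (8.1). If we denote by $\rho_{C_\zeta}$, $\rho_{L_\zeta}$, and $\rho_{R_\zeta}$ the densities of the
Gaussian distributions in the spaces $C_\zeta$, $L_\zeta$, and $R_\zeta$ with covariance matrices
$\sigma^2 I$ and means $\bar k_\zeta$, $\bar g_\zeta$, and $\bar h_\zeta$, respectively, then the density $\rho_{\mathcal{H}_d}$ factors as

$$\rho_{\mathcal{H}_d}(k + g + h) = \rho_{C_\zeta}(k) \cdot \rho_{L_\zeta}(g) \cdot \rho_{R_\zeta}(h). \tag{8.5}$$

The Gaussian density $\rho_{L_\zeta}$ on $L_\zeta$ induces a Gaussian density $\rho_{W_\zeta}$ on the
fiber $W_\zeta$ with the covariance matrix $\sigma^2 I$ via the isometrical linear map (8.2),
so $\rho_{W_\zeta}(M) = \rho_{L_\zeta}(g_{M,\zeta})$.

We derive now from the given Gaussian distribution $\rho_{\mathcal{H}_d}$ on $\mathcal{H}_d$ a
probability distribution on $V$ as follows (naturally extending $\rho_{\mathrm{st}}$ introduced in
§3.2). Think of choosing $(q, \zeta)$ at random from $V$ by first choosing $q \in \mathcal{H}_d$
from $N(\bar q, \sigma^2 I)$, then choosing one of its $\mathcal{D}$ zeros $[\zeta] \in \mathbb{P}^n$ at random from the
uniform distribution on $\{1, \ldots, \mathcal{D}\}$, and finally choosing a representative $\zeta$
in the unit circle $[\zeta] \cap S^n$ uniformly at random. (An explicit expression of
the corresponding probability density $\rho_V$ on $V$ is given in (8.23).)

The plan to show Theorem 3.6 is as follows. The forthcoming Lemma 8.8
tells us that

$$\mathbb{E}_{\mathcal{H}_d}\left(\frac{\mu_2^2(q)}{\|q\|^2}\right) = \mathbb{E}_V\left(\frac{\mu_{\mathrm{norm}}^2(q, \zeta)}{\|q\|^2}\right) \tag{8.6}$$

where $\mathbb{E}_{\mathcal{H}_d}$ and $\mathbb{E}_V$ refer to the expectations with respect to the
distribution $N(\bar q, \sigma^2 I)$ on $\mathcal{H}_d$ and the probability density $\rho_V$ on $V$, respectively.
Moreover, by Equation (8.4),

$$\mathbb{E}_V\left(\frac{\mu_{\mathrm{norm}}^2(q, \zeta)}{\|q\|^2}\right) = \mathbb{E}_{\mathscr{M}}\big(\|M^\dagger\|^2\big),$$

where $\mathbb{E}_{\mathscr{M}}$ denotes the expectation with respect to the pushforward
density $\rho_{\mathscr{M}}$ of $\rho_V$ with respect to the map $p_1 \circ \Psi \colon V \to \mathscr{M}$ (for more on
pushforwards see §8.3).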
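Of course, we need to better understand the density $\rho_{\mathscr{M}}$. Let $M \in \mathscr{M}$ be
of rank $n$ and $\zeta \in S^n$ with $M\zeta = 0$. The following formula

$$\rho_{\mathscr{M}}(M) = \rho_{C_\zeta}(0) \cdot \frac{1}{2\pi} \int_{\lambda \in S^1} \rho_{W_{\lambda\zeta}}(M)\, dS^1 \tag{8.7}$$

can be heuristically explained as follows. We decompose a random $q \in \mathcal{H}_d$
according to the decomposition $\mathcal{H}_d = C_\zeta \oplus L_\zeta \oplus R_\zeta$ as $q = k + g + h$. Choose
$\lambda \in \mathbb{C}$ with $|\lambda| = 1$ uniformly at random in the unit circle. Then we have
$\Psi(q, \lambda\zeta) = (M, \lambda\zeta)$ iff $k = 0$ and $g$ is mapped to $M$ under the isometry
in (8.2). The probability density for the event $k = 0$ equals $\rho_{C_\zeta}(0)$. The
second event, conditioned on $\lambda$, has the probability density $\rho_{W_{\lambda\zeta}}(M)$.
By general principles (cf. §8.3) we have

$$\mathbb{E}_{\mathscr{M}}\big(\|M^\dagger\|^2\big) = \mathbb{E}_{\zeta \sim \rho_{S^n}}\Big(\mathbb{E}_{M \sim \tilde\rho_{W_\zeta}}\big(\|M^\dagger\|^2\big)\Big), \tag{8.8}$$

where $\rho_{S^n}$ is the pushforward density of $\rho_V$ with respect to $p_2 \circ \Psi \colon V \to S^n$
and $\tilde\rho_{W_\zeta}$ denotes a "conditional density" on the fiber $W_\zeta$. This conditional
density turns out to be of the form

$$\tilde\rho_{W_\zeta}(M) = c_\zeta^{-1} \cdot \det(MM^*)\, \rho_{W_\zeta}(M) \tag{8.9}$$

($c_\zeta$ denoting a normalization factor). In the case $\zeta = (1, 0, \ldots, 0)$ we can
identify $W_\zeta$ with $\mathbb{C}^{n \times n}$ and $\tilde\rho_{W_\zeta}$ takes the form (7.1) studied in Section 7.
Proposition 7.1 and unitary invariance imply that for all $\zeta \in S^n$

$$\mathbb{E}_{M \sim \tilde\rho_{W_\zeta}}\big(\|M^\dagger\|^2\big) \le \frac{e(n+1)}{2\sigma^2}. \tag{8.10}$$

This implies by (8.8) that

$$\mathbb{E}_{\mathscr{M}}\big(\|M^\dagger\|^2\big) \le \frac{e(n+1)}{2\sigma^2}$$

and completes the outline of the proof of Theorem 3.6.

The formal proof of the stated facts (8.7)-(8.9) is quite involved and will
be given in the remainder of this section.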

8.3. Coarea formula. We begin by recalling the coarea formula that tells 
us how probability distributions on Riemannian manifolds transform. 

Suppose that $X$, $Y$ are Riemannian manifolds of dimensions $m$, $n$,
respectively, such that $m \ge n$. Let $\varphi \colon X \to Y$ be differentiable. By definition, the
derivative $D\varphi(x) \colon T_x X \to T_{\varphi(x)} Y$ at a regular point $x \in X$ is surjective.
Hence the restriction of $D\varphi(x)$ to the orthogonal complement of its kernel
yields a linear isomorphism. The absolute value of its determinant is called
the normal Jacobian of $\varphi$ at $x$ and denoted $\mathrm{NJ}\varphi(x)$. We set $\mathrm{NJ}\varphi(x) := 0$ if
$x$ is not a regular point. We note that the fiber $F_y := \varphi^{-1}(y)$ is a Riemannian
submanifold of $X$ of dimension $m - n$ if $y$ is a regular value of $\varphi$. Sard's
lemma states that almost all $y \in Y$ are regular values.
The following result is the coarea formula, sometimes also called Fubini's
Theorem for Riemannian manifolds. A proof can be found e.g., in 
Appendix]. 

Proposition 8.3. Suppose that $X$, $Y$ are Riemannian manifolds of dimensions
$m$, $n$, respectively, and let $\varphi \colon X \to Y$ be a surjective differentiable
map. Put $F_y = \varphi^{-1}(y)$. Then we have for any function $\chi \colon X \to \mathbb{R}$ that is
integrable with respect to the volume measure of $X$ that

$$\int_X \chi\, dX = \int_{y \in Y} \left( \int_{F_y} \frac{\chi}{\mathrm{NJ}\varphi}\, dF_y \right) dY.$$

□

Now suppose that we are in the situation described in the statement of
Proposition 8.3 and we have a probability measure on $X$ with density $\rho_X$.
For a regular value $y \in Y$ we set

$$\rho_Y(y) := \int_{F_y} \frac{\rho_X}{\mathrm{NJ}\varphi}\, dF_y. \tag{8.11}$$

The coarea formula implies that for all measurable sets $B \subseteq Y$ we have

$$\int_{\varphi^{-1}(B)} \rho_X\, dX = \int_B \rho_Y\, dY.$$

Hence $\rho_Y$ is a probability density on $Y$. We call it the pushforward of $\rho_X$
with respect to $\varphi$.

For a regular value $y \in Y$ and $x \in F_y$ we define

$$\rho_{F_y}(x) := \frac{\rho_X(x)}{\rho_Y(y)\, \mathrm{NJ}\varphi(x)}. \tag{8.12}$$

Clearly, this defines a probability density on $F_y$. The coarea formula implies
that for all measurable functions $\chi \colon X \to \mathbb{R}$

$$\int_X \chi\, \rho_X\, dX = \int_{y \in Y} \left( \int_{F_y} \chi\, \rho_{F_y}\, dF_y \right) \rho_Y(y)\, dY,$$

provided the left-hand integral exists. Therefore, we can interpret $\rho_{F_y}$ as
the density of the conditional distribution of $x$ on the fiber $F_y$ and briefly
express the formula above in probabilistic terms as

$$\mathbb{E}_{x \sim \rho_X}\big(\chi(x)\big) = \mathbb{E}_{y \sim \rho_Y}\Big(\mathbb{E}_{x \sim \rho_{F_y}}\big(\chi(x)\big)\Big). \tag{8.13}$$

To put these formulas to use in our context, we must compute the normal
Jacobians of some maps.

8.4. Normal Jacobians. We start with a general comment. Note that
the $\mathbb{R}$-linear map $\mathbb{C} \to \mathbb{C}$, $z \mapsto \lambda z$ with $\lambda \in \mathbb{C}$ has determinant $|\lambda|^2$. More
generally, let $\varphi$ be an endomorphism of a finite dimensional complex vector
space. Then $|\det \varphi|^2$ equals the determinant of $\varphi$, seen as an $\mathbb{R}$-linear map.
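The one-dimensional case of this comment is easy to verify by writing multiplication by $\lambda = a + ib$ as a real $2 \times 2$ matrix in the basis $\{1, i\}$. The sketch below is a small illustration with hypothetical helper names.

```python
def real_matrix_of_multiplication(lam):
    """Real 2x2 matrix of z -> lam * z on C = R^2 in the basis {1, i}."""
    a, b = lam.real, lam.imag
    return [[a, -b],
            [b, a]]

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

lam = 3 - 4j
real_det = det2(real_matrix_of_multiplication(lam))   # a^2 + b^2
```

Here `real_det` equals `abs(lam) ** 2`, in line with the factor $|\det \varphi|^2$ that appears whenever a complex linear map is viewed as a real one.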

We describe now the normal Jacobian of the projection $p_1 \colon W \to \mathscr{M}$
following [23].
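Lemma 8.4. We have $\mathrm{NJ}p_1(M, \zeta) = \prod_{i=1}^n (1 + \sigma_i^{-2})^{-1}$, where $\sigma_1, \ldots, \sigma_n$ are
the singular values of $M$.

Proof. First note that $T_\zeta S^n = \{\dot\zeta \in \mathbb{C}^{n+1} \mid \mathrm{Re}\langle \zeta, \dot\zeta \rangle = 0\}$. The tangent space
$T_{(M,\zeta)}W$ consists of the $(\dot M, \dot\zeta) \in \mathscr{M} \times T_\zeta S^n$ such that $\dot M \zeta + M \dot\zeta = 0$.

By unitary invariance we may assume that $\zeta = (1, 0, \ldots, 0)$. Then the first
column of $M$ vanishes and we denote by $A = [m_{ij}] \in \mathbb{C}^{n \times n}$ the remaining
part of $M$. W.l.o.g. we may assume that $A$ is invertible. Further, let $\dot u \in \mathbb{C}^n$
denote the first column of $\dot M$ and $\dot A \in \mathbb{C}^{n \times n}$ its remaining part. We may
thus identify $T_{(M,\zeta)}W$ with the product $E \times \mathbb{C}^{n \times n}$ via $(\dot M, \dot\zeta) \mapsto ((\dot u, \dot\zeta), \dot A)$,
where $E$ denotes the subspace

$$E := \Big\{(\dot u, \dot\zeta) \in \mathbb{C}^n \times \mathbb{C}^{n+1} \;\Big|\; \dot u_i + \sum_{j=1}^n m_{ij} \dot\zeta_j = 0,\ 1 \le i \le n,\ \dot\zeta_0 \in i\mathbb{R}\Big\}.$$

We also note that $E \simeq \mathrm{graph}(-A) \times i\mathbb{R}$. The derivative of $p_1$ is described
by the following commutative diagram

$$\begin{array}{ccc}
T_{(M,\zeta)}W & \xrightarrow{\ \simeq\ } & (\mathrm{graph}(-A) \times i\mathbb{R}) \times \mathbb{C}^{n \times n} \\
\downarrow {\scriptstyle Dp_1(M,\zeta)} & & \downarrow {\scriptstyle \mathrm{pr} \times \mathrm{id}} \\
\mathscr{M} & \xrightarrow{\ \simeq\ } & \mathbb{C}^n \times \mathbb{C}^{n \times n}
\end{array}$$

where $\mathrm{pr}(\dot u, \dot\zeta) = \dot u$. Using the singular value decomposition we may assume
that $A = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$. Then the pseudoinverse of the projection pr is
given by the $\mathbb{R}$-linear map

$$\psi \colon \mathbb{C}^n \to \mathrm{graph}(-A), \quad (\dot u_1, \ldots, \dot u_n) \mapsto (\dot u_1, \ldots, \dot u_n, -\sigma_1^{-1}\dot u_1, \ldots, -\sigma_n^{-1}\dot u_n).$$

It is easy to see that $\det \psi = \prod_{i=1}^n (1 + \sigma_i^{-2})$. To complete the proof we note
that $1/\mathrm{NJ}p_1(M, \zeta) = \det \psi$. □

We have already seen that the condition number $\mu_{\mathrm{norm}}(q, \zeta)$ can be
described in terms of the map $\Psi$ introduced in (8.3). As a stepping stone
towards the analysis of the normal Jacobian of $\Psi$ we introduce now the
related bundle map

$$\Phi \colon V \to W, \quad (q, \zeta) \mapsto (Dq(\zeta), \zeta)$$

whose normal Jacobian turns out to be constant. (This crucial observation
is due to Beltrán and Pardo in [6].)

Proposition 8.5. We have $\mathrm{NJ}\Phi(q, \zeta) = \mathcal{D}^n$ for all $(q, \zeta) \in V$.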

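Proof. By unitary invariance we may assume without loss of generality that
$\zeta = (1, 0, \ldots, 0)$. If we write $N = (n_{ij}) = Dq(\zeta) \in \mathscr{M}$ we must have $n_{i0} = 0$
since $N\zeta = 0$. Moreover, according to the orthogonal decomposition (8.1)
and Lemma 8.1 we have for $1 \le i \le n$

$$q_i = X_0^{d_i - 1} \sum_{j=1}^n n_{ij} X_j + h_i$$

for some $h = (h_1, \ldots, h_n) \in R_\zeta$. We further express $\dot q \in T_q \mathcal{H}_d = \mathcal{H}_d$ as

$$\dot q_i = \dot u_i X_0^{d_i} + \sqrt{d_i}\, X_0^{d_i - 1} \sum_{j=1}^n \dot a_{ij} X_j + \dot h_i$$

in terms of the coordinates $\dot u = (\dot u_i) \in \mathbb{C}^n$, $\dot A = (\dot a_{ij}) \in \mathbb{C}^{n \times n}$, and
$\dot h = (\dot h_i) \in R_\zeta$. The reason to put the factor $\sqrt{d_i}$ here is that

$$\|\dot q\|^2 = \sum_i |\dot u_i|^2 + \sum_{i,j} |\dot a_{ij}|^2 + \sum_i \|\dot h_i\|^2 \tag{8.14}$$

by the definition of the Bombieri-Weyl inner product.

The tangent space $T_{(q,\zeta)}V$ consists of the $(\dot q, \dot\zeta) \in \mathcal{H}_d \times T_\zeta S^n$ such that
$\dot q(\zeta) + N\dot\zeta = 0$, see [8, §10.3, Prop. 1]. This condition can be expressed in
coordinates as

$$\dot u_i + \sum_{j=1}^n n_{ij} \dot\zeta_j = 0, \quad i = 1, \ldots, n. \tag{8.15}$$

By (8.14) the inner product on $T_{(q,\zeta)}V$ is given by the standard inner product
in the chosen coordinates $\dot u_i$, $\dot a_{ij}$, $\dot\zeta_j$ if $\dot h_i = 0$. Thinking of the description
of $T_{(N,\zeta)}W$ given in the proof of Lemma 8.4 we may therefore isometrically
identify $T_{(q,\zeta)}V$ with the product $T_{(N,\zeta)}W \times R_\zeta$ via $(\dot q, \dot\zeta) \mapsto ((\dot u, \dot A, \dot\zeta), \dot h)$.
The derivative of $\pi_1$ is then described by the commutative diagram

$$\begin{array}{ccc}
T_{(q,\zeta)}V & \xrightarrow{\ \simeq\ } & T_{(N,\zeta)}W \times R_\zeta \\
\downarrow {\scriptstyle D\pi_1(q,\zeta)} & & \downarrow {\scriptstyle Dp_1(N,\zeta) \times \mathrm{id}} \\
\mathcal{H}_d & \xrightarrow{\ \simeq\ } & \mathscr{M} \times R_\zeta
\end{array} \tag{8.16}$$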

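We shall next calculate the derivative of $\Phi$. For this, we will use the
shorthand $\partial_k q$ for the partial derivative $\partial_{X_k} q$, etc. A short calculation yields,
for $j > 0$,

$$\partial_0 \dot q_i(\zeta) = d_i \dot u_i, \quad \partial_j \dot q_i(\zeta) = \sqrt{d_i}\, \dot a_{ij}, \quad \partial^2_{0j} q_i(\zeta) = (d_i - 1)\, n_{ij}. \tag{8.17}$$

Similarly, we obtain $\partial_0 q_i(\zeta) = 0$ and $\partial_j q_i(\zeta) = n_{ij}$ for $j > 0$.

The derivative $D\Phi(q, \zeta) \colon T_{(q,\zeta)}V \to T_{(N,\zeta)}W$ is determined by

$$D\Phi(q, \zeta)(\dot q, \dot\zeta) = (\dot N, \dot\zeta), \quad\text{where } \dot N = D\dot q(\zeta) + D^2 q(\zeta)(\dot\zeta, \cdot).$$

Introducing the coordinates $\dot N = (\dot n_{ij})$ this can be written as

$$\dot n_{ij} = \partial_j \dot q_i(\zeta) + \sum_{k=1}^n \partial^2_{jk} q_i(\zeta)\, \dot\zeta_k. \tag{8.18}$$

For $j > 0$ this gives, using (8.17),

$$\dot n_{ij} = \sqrt{d_i}\, \dot a_{ij} + \sum_{k=1}^n \partial^2_{jk} q_i(\zeta)\, \dot\zeta_k. \tag{8.19}$$

For $j = 0$ we obtain from (8.18), using (8.17) and (8.15),

$$\dot n_{i0} = \partial_0 \dot q_i(\zeta) + \sum_{k=1}^n \partial^2_{0k} q_i(\zeta)\, \dot\zeta_k = d_i \dot u_i + (d_i - 1) \sum_{k=1}^n n_{ik} \dot\zeta_k = \dot u_i. \tag{8.20}$$

Note the crucial cancellation taking place here!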

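From (8.19) and (8.20) we see that the kernel $K$ of $D\Phi(q, \zeta)$ is determined
by the conditions $\dot\zeta = 0$, $\dot u = 0$, $\dot A = 0$. Hence, recalling $T_{(q,\zeta)}V \simeq T_{(N,\zeta)}W \times R_\zeta$,
we have $K \simeq 0 \times R_\zeta$ and $K^\perp \simeq T_{(N,\zeta)}W \times 0$. Moreover, as in the proof
of Lemma 8.4 (but replacing $M$ by $N$) we write

$$E := \Big\{(\dot u, \dot\zeta) \in \mathbb{C}^n \times \mathbb{C}^{n+1} \;\Big|\; \dot u_i + \sum_{j=1}^n n_{ij} \dot\zeta_j = 0,\ 1 \le i \le n,\ \dot\zeta_0 \in i\mathbb{R}\Big\}$$

and identify $T_{(N,\zeta)}W$ with $E \times \mathbb{C}^{n \times n}$. Using this identification of spaces,
(8.19) and (8.20) imply that $D\Phi(q, \zeta)|_{K^\perp}$ has the following structure:

$$D\Phi(q, \zeta)|_{K^\perp} \colon E \times \mathbb{C}^{n \times n} \to E \times \mathbb{C}^{n \times n}, \quad ((\dot u, \dot\zeta), \dot A) \mapsto ((\dot u, \dot\zeta), \Lambda(\dot A) + \rho(\dot\zeta)),$$

where the linear map $\Lambda \colon \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n}$, $\dot A \mapsto (\sqrt{d_i}\, \dot a_{ij})$, multiplies the $i$th
row of $\dot A$ with $\sqrt{d_i}$ and $\rho \colon \mathbb{C}^{n+1} \to \mathbb{C}^{n \times n}$ is given by $\rho(\dot\zeta)_{ij} = \sum_{k=1}^n \partial^2_{jk} q_i(\zeta)\, \dot\zeta_k$.
By definition we have $\mathrm{NJ}\Phi(q, \zeta) = |\det D\Phi(q, \zeta)|_{K^\perp}|$. The triangular form
of $D\Phi(q, \zeta)|_{K^\perp}$ shown above implies that $|\det D\Phi(q, \zeta)|_{K^\perp}| = \det \Lambda$. Finally,
using the diagonal form of $\Lambda$, we obtain $\det \Lambda = \prod_{i=1}^n d_i^{\,n} = \mathcal{D}^n$, which
completes the proof. □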

Remark 8.6. An inspection of the proof of Proposition 8.5 reveals that the second order derivatives occurring in $D\Phi$ do not have any impact on the normal Jacobian $\mathrm{NJ}\Phi$. Its value $\mathcal{D}^n$ occurs as a result of the chosen Bombieri-Weyl inner product on $\mathcal{H}_{\mathbf d}$. With respect to the naive inner product on $\mathcal{H}_{\mathbf d}$ (where the monomials form an orthonormal basis), the normal Jacobian of $\Phi$ at $(q,\zeta)$ would be equal to one at $\zeta = (1,0,\ldots,0)$. However, unitary invariance would not hold and the normal Jacobian would take different values elsewhere.

Before proceeding we note the following consequence of Equation (8.16):
\[ \mathrm{NJ}\pi_1(q,\zeta) = \mathrm{NJ}p_1(N,\zeta), \quad\text{where } N = Dq(\zeta). \tag{8.21} \]
The normal Jacobian of the map $\Psi\colon V \to W$ is not constant and takes a more complicated form in terms of the normal Jacobians of the projection $p_1\colon W \to \mathscr{M}$. For obtaining an expression for $\mathrm{NJ}\Psi$ we need the following lemma.

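Lemma 8.7. The scaling map $\gamma\colon W \to W$, $(N,\zeta) \mapsto (M,\zeta)$ with $M = \Delta^{-1}N$ of rank $n$ satisfies
\[ \det D\gamma(N,\zeta) = \frac{1}{\mathcal{D}^{\,n+1}}\,\frac{\mathrm{NJ}p_1(N,\zeta)}{\mathrm{NJ}p_1(M,\zeta)}. \]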
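Proof. If $W_{\mathbb P}$ denotes the solution variety in $\mathscr{M} \times \mathbb{P}^n$ analogous to $W$ we have $T_{(N,\zeta)}W = T_{(N,\zeta)}W_{\mathbb P} \oplus \mathbb{R}i\zeta$. Let $p_1'\colon W_{\mathbb P} \to \mathscr{M}$ denote the projection. The derivative $D\gamma_{\mathbb P}(N,\zeta)$ of the corresponding scaling map $\gamma_{\mathbb P}\colon W_{\mathbb P} \to W_{\mathbb P}$ is determined by the commutative diagram
\[
\begin{array}{ccc}
T_{(N,\zeta)}W_{\mathbb P} & \xrightarrow{\;D\gamma_{\mathbb P}(N,\zeta)\;} & T_{(M,\zeta)}W_{\mathbb P} \\
{\scriptstyle Dp_1'(N,\zeta)}\big\downarrow & & \big\downarrow{\scriptstyle Dp_1'(M,\zeta)} \\
\mathscr{M} & \xrightarrow{\quad\mathrm{sc}\quad} & \mathscr{M}
\end{array}
\]
where the vertical arrows are linear isomorphisms. The assertion follows by observing that $\mathrm{NJ}p_1(N,\zeta) = \det Dp_1'(N,\zeta)$, $\mathrm{NJ}\gamma(N,\zeta) = \det D\gamma_{\mathbb P}(N,\zeta)$, and using that the $\mathbb{R}$-linear map $\mathrm{sc}\colon \mathscr{M} \to \mathscr{M}$, $N \mapsto M = \Delta^{-1}N$, has the determinant $1/\mathcal{D}^{\,n+1}$. □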

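Proposition 8.5 combined with Lemma 8.7 immediately gives
\[ \mathrm{NJ}\Psi(q,\zeta) = \frac{1}{\mathcal{D}}\,\frac{\mathrm{NJ}p_1(N,\zeta)}{\mathrm{NJ}p_1(M,\zeta)} \tag{8.22} \]
for $N = Dq(\zeta)$, $M = \Delta^{-1}N$.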

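8.5. Induced probability distributions. By Bezout's theorem, the fiber $V(q)$ of the projection $\pi_1\colon V \to \mathcal{H}_{\mathbf d}$ at $q \in \mathcal{H}_{\mathbf d}$ is a disjoint union of $\mathcal{D} = d_1 \cdots d_n$ unit circles and therefore has the volume $2\pi\mathcal{D}$, provided $q$ does not lie in the discriminant variety.

Recall that $\rho_{\mathcal{H}_{\mathbf d}}$ denotes the density of the Gaussian distribution $N(\bar q, \sigma^2 I)$ for fixed $\bar q \in \mathcal{H}_{\mathbf d}$ and $\sigma > 0$, and $\mathbb{E}_{\mathcal{H}_{\mathbf d}}$ stands for expectation taken with respect to that density. We associate with $\rho_{\mathcal{H}_{\mathbf d}}$ the function $\rho_V\colon V \to \mathbb{R}$ defined by
\[ \rho_V(q,\zeta) := \frac{1}{2\pi\mathcal{D}}\,\rho_{\mathcal{H}_{\mathbf d}}(q)\,\mathrm{NJ}\pi_1(q,\zeta). \tag{8.23} \]
The next result shows that $\rho_V$ is the probability density function of the distribution on $V$ we described in §8.2.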



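Lemma 8.8. (1) The function $\rho_V$ is a probability density on $V$.

(2) The expectation of a function $\varphi\colon V \to \mathbb{R}$ with respect to $\rho_V$ can be expressed as $\mathbb{E}_V(\varphi) = \mathbb{E}_{\mathcal{H}_{\mathbf d}}(\varphi_{\mathrm{av}})$, where
\[ \varphi_{\mathrm{av}}(q) := \frac{1}{2\pi\mathcal{D}} \int_{V(q)} \varphi \, dV(q). \]

(3) The pushforward of $\rho_V$ with respect to $\pi_1\colon V \to \mathcal{H}_{\mathbf d}$ equals $\rho_{\mathcal{H}_{\mathbf d}}$.

(4) For $q \notin \Sigma$, the conditional density on the fiber $V(q)$ is the density of the uniform distribution on $V(q)$.

(5) The probability density $\rho_{\mathrm{st}}$ on $V_{\mathbb P}$ introduced in §3.2 is obtained from the density $\rho_V$ in the case $\bar q = 0$, $\sigma = 1$ as the pushforward under the canonical map $V \to V_{\mathbb P}$, $(f,\zeta) \mapsto (f,[\zeta])$. Explicitly, we have
\[ \rho_{\mathrm{st}}(q,[\zeta]) = \frac{1}{\mathcal{D}}\,\frac{1}{(2\pi)^N}\, e^{-\frac12\|q\|^2}\, \mathrm{NJ}\pi_1(q,\zeta). \]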

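Proof. The coarea formula (Proposition 8.3) applied to $\pi_1\colon V \to \mathcal{H}_{\mathbf d}$ implies
\[ \int_V \varphi\,\rho_V\, dV = \int_{q\in\mathcal{H}_{\mathbf d}} \Big( \int_{\zeta\in V(q)} \varphi(q,\zeta)\,\frac{\rho_V(q,\zeta)}{\mathrm{NJ}\pi_1(q,\zeta)}\, dV(q) \Big)\, d\mathcal{H}_{\mathbf d} = \int_{q\in\mathcal{H}_{\mathbf d}} \varphi_{\mathrm{av}}(q)\,\rho_{\mathcal{H}_{\mathbf d}}(q)\, d\mathcal{H}_{\mathbf d}. \]
Taking $\varphi = 1$ reveals that $\rho_V$ is a density, proving the first assertion. The above formula also shows the second assertion.

By Equation (8.11) the pushforward density $\rho$ of $\rho_V$ with respect to $\pi_1$ satisfies
\[ \rho(q) = \int_{\zeta\in V(q)} \frac{\rho_V(q,\zeta)}{\mathrm{NJ}\pi_1(q,\zeta)}\, dV(q) = \rho_{\mathcal{H}_{\mathbf d}}(q), \]
as $\int dV(q) = 2\pi\mathcal{D}$. This shows the third assertion. By (8.12) the conditional density satisfies
\[ \rho_{V(q)}(\zeta) = \frac{\rho_V(q,\zeta)}{\rho_{\mathcal{H}_{\mathbf d}}(q)\,\mathrm{NJ}\pi_1(q,\zeta)} = \frac{1}{2\pi\mathcal{D}}, \]
which shows the fourth assertion. The fifth assertion is trivial. □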

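We can now determine the various probability distributions induced by $\rho_V$.

Proposition 8.9. We have
\[ \frac{\rho_V}{\mathrm{NJ}\Psi}(g_{M,\zeta} + h, \zeta) = \rho_W(M,\zeta)\cdot\rho_{R_\zeta}(h), \]
where the pushforward density $\rho_W$ of $\rho_V$ with respect to $\Psi\colon V \to W$ satisfies
\[ \rho_W(M,\zeta) = \frac{1}{2\pi}\,\rho_{C_\zeta}(0)\cdot\rho_{W_\zeta}(M)\cdot\mathrm{NJ}p_1(M,\zeta). \]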

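Proof. Using the factorization of Gaussians (8.5) and Equation (8.21), the density $\rho_V$ can be written as
\[ \rho_V(g_{M,\zeta} + h, \zeta) = \frac{1}{2\pi\mathcal{D}}\,\rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\,\rho_{R_\zeta}(h)\,\mathrm{NJ}p_1(N,\zeta), \]
where $N = \Delta M$. It follows from (8.22) that
\[ \frac{\rho_V}{\mathrm{NJ}\Psi}(g_{M,\zeta} + h, \zeta) = \frac{1}{2\pi}\,\rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\,\rho_{R_\zeta}(h)\,\mathrm{NJ}p_1(M,\zeta). \tag{8.24} \]
This implies, using (8.11) for $\Psi\colon V \to W$ and the isometry $\Psi^{-1}(M,\zeta) \simeq R_\zeta$ for the fiber at $\zeta$, that
\[
\begin{aligned}
\rho_W(M,\zeta) &= \int_{h\in R_\zeta} \frac{\rho_V}{\mathrm{NJ}\Psi}(g_{M,\zeta} + h, \zeta)\, dR_\zeta \\
&= \frac{1}{2\pi}\,\rho_{C_\zeta}(0)\cdot\rho_{W_\zeta}(M)\cdot\mathrm{NJ}p_1(M,\zeta) \int_{h\in R_\zeta} \rho_{R_\zeta}(h)\, dR_\zeta \\
&= \frac{1}{2\pi}\,\rho_{C_\zeta}(0)\cdot\rho_{W_\zeta}(M)\cdot\mathrm{NJ}p_1(M,\zeta)
\end{aligned}
\]
as claimed. Replacing in (8.24) we therefore obtain
\[ \frac{\rho_V}{\mathrm{NJ}\Psi}(g_{M,\zeta} + h, \zeta) = \rho_W(M,\zeta)\,\rho_{R_\zeta}(h). \qquad □ \]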



The claimed formula (8.7) for the pushforward density $\rho_{\mathscr M}$ of $\rho_W$ with respect to $p_1\colon W \to \mathscr{M}$ immediately follows from Proposition 8.9 by integrating $\rho_W/\mathrm{NJ}p_1$ over the fibers of $p_1$.

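Lemma 8.10. Let $c_\zeta$ denote the expectation of $\det(MM^*)$ with respect to $\rho_{W_\zeta}$. We have
\[ \frac{\rho_W}{\mathrm{NJ}p_2}(M,\zeta) = \rho_{S^n}(\zeta)\cdot\tilde\rho_{W_\zeta}(M), \]
where $\rho_{S^n}(\zeta) = \frac{c_\zeta}{2\pi}\,\rho_{C_\zeta}(0)$ is the pushforward density of $\rho_W$ with respect to $p_2\colon W \to S^n$, and where the conditional density $\tilde\rho_{W_\zeta}$ on the fiber $W_\zeta$ of $p_2$ is given by
\[ \tilde\rho_{W_\zeta}(M) = c_\zeta^{-1}\cdot\det(MM^*)\,\rho_{W_\zeta}(M). \]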
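Proof. In [23] (see also [8, Section 13.2, Lemmas 2-3]) it is shown that
\[ \frac{\mathrm{NJ}p_1}{\mathrm{NJ}p_2}(M,\zeta) = \det(MM^*). \tag{8.25} \]
Combining this with Proposition 8.9 we get
\[ \frac{\rho_W}{\mathrm{NJ}p_2}(M,\zeta) = \frac{1}{2\pi}\,\rho_{C_\zeta}(0)\cdot\rho_{W_\zeta}(M)\cdot\det(MM^*). \]
Integrating over $W_\zeta$ we get $\rho_{S^n}(\zeta) = \frac{1}{2\pi}\,\rho_{C_\zeta}(0)\cdot c_\zeta$, and finally (cf. (8.12))
\[ \tilde\rho_{W_\zeta}(M) = \frac{\rho_W(M,\zeta)}{\rho_{S^n}(\zeta)\,\mathrm{NJ}p_2(M,\zeta)} = c_\zeta^{-1}\cdot\rho_{W_\zeta}(M)\cdot\det(MM^*) \]
as claimed. □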



This lemma shows that the conditional density $\tilde\rho_{W_\zeta}$ has the form stated in (8.9) and therefore completes the proof of Theorem 3.6.
8.6. Expected number of real zeros. As a further illustration of the interplay of Gaussians with the coarea formula in our setting, we give a simplified proof of one of the main results of [23]. This subsection is not needed for understanding the remainder of the paper.

Our developments so far took place over the complex numbers $\mathbb{C}$, but much of what has been said carries over to the situation over $\mathbb{R}$. However, we note that algorithm ALH would not work over $\mathbb{R}$ since the lifting of the segment $E_{f,g}$ will likely contain a multiple zero (over $\mathbb{C}$ this happens with probability zero since the real codimension of the discriminant variety equals two).

Let $\mathcal{H}_{\mathbf d,\mathbb R}$ denote the space of real polynomial systems in $\mathcal{H}_{\mathbf d}$ endowed with the Bombieri-Weyl inner product. The standard Gaussian distribution on $\mathcal{H}_{\mathbf d,\mathbb R}$ is well-defined and we denote its density by $\rho_{\mathcal{H}_{\mathbf d,\mathbb R}}$.

Corollary 8.11. The average number of zeros of a standard Gaussian random $f \in \mathcal{H}_{\mathbf d,\mathbb R}$ in the real projective space $\mathbb{P}^n(\mathbb{R})$ equals $\sqrt{\mathcal{D}}$.

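Proof. Let $\chi(q)$ denote the number of real zeros in $\mathbb{P}^n(\mathbb{R})$ of $q \in \mathcal{H}_{\mathbf d,\mathbb R}$. Thus the number of real zeros in the sphere $S^n = S(\mathbb{R}^{n+1})$ equals $2\chi(q)$. The real solution variety $V_{\mathbb R} \subseteq \mathcal{H}_{\mathbf d,\mathbb R} \times S^n$ is defined in the obvious way and so is $W_{\mathbb R} \subseteq \mathscr{M}_{\mathbb R} \times S^n$, where $\mathscr{M}_{\mathbb R} = \mathbb{R}^{n\times(n+1)}$.

The same proof as for Proposition 8.5 shows that the normal Jacobian of the map $\Phi_{\mathbb R}\colon V_{\mathbb R} \to W_{\mathbb R}$, $(q,\zeta) \mapsto (Dq(\zeta),\zeta)$ has the constant value $\mathcal{D}^{n/2}$ (the 2 in the exponent due to the considerations opening §8.4).

Applying the coarea formula to the projection $\pi_1\colon V_{\mathbb R} \to \mathcal{H}_{\mathbf d,\mathbb R}$ yields
\[ \int_{\mathcal{H}_{\mathbf d,\mathbb R}} \chi\,\rho_{\mathcal{H}_{\mathbf d,\mathbb R}}\, d\mathcal{H}_{\mathbf d,\mathbb R} = \int_{q\in\mathcal{H}_{\mathbf d,\mathbb R}} \rho_{\mathcal{H}_{\mathbf d,\mathbb R}}(q)\,\frac12 \int_{\pi_1^{-1}(q)} d\pi_1^{-1}(q)\, d\mathcal{H}_{\mathbf d,\mathbb R} = \int_{V_{\mathbb R}} \frac12\,\rho_{\mathcal{H}_{\mathbf d,\mathbb R}}\,\mathrm{NJ}\pi_1\, dV_{\mathbb R}. \]
We can factor the standard Gaussian $\rho_{\mathcal{H}_{\mathbf d,\mathbb R}}$ into standard Gaussian densities $\rho_{C_\zeta}$ and $\rho_{L_\zeta}$ on $C_\zeta$ and $L_\zeta$, respectively, as it was done in §8.5 over $\mathbb{C}$ (denoting them by the same symbol will not cause any confusion). We also have an isometry $W_\zeta \to L_\zeta$ as in (8.2) and $\rho_{L_\zeta}$ induces the standard Gaussian density $\rho_{W_\zeta}$ on $W_\zeta$. The fiber of $\Phi_{\mathbb R}\colon V_{\mathbb R} \to W_{\mathbb R}$, $(q,\zeta) \mapsto (N,\zeta)$ over $(N,\zeta)$ has the form $\Phi_{\mathbb R}^{-1}(N,\zeta) = \{(g_{M,\zeta} + h, \zeta) \mid h \in R_\zeta\}$ where $M = \Delta^{-1}N$, cf. Lemma 8.2. We therefore have $\rho_{\mathcal{H}_{\mathbf d,\mathbb R}}(g_{M,\zeta} + h) = \rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\,\rho_{R_\zeta}(h)$.

The coarea formula applied to $\Phi_{\mathbb R}\colon V_{\mathbb R} \to W_{\mathbb R}$, using Equation (8.21), yields
\[
\begin{aligned}
\int_{V_{\mathbb R}} \frac12\,\rho_{\mathcal{H}_{\mathbf d,\mathbb R}}\,\mathrm{NJ}\pi_1\, dV_{\mathbb R}
&= \frac{1}{2\,\mathrm{NJ}\Phi_{\mathbb R}} \int_{(N,\zeta)\in W_{\mathbb R}} \rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\,\mathrm{NJ}p_1(N,\zeta) \int_{h\in R_\zeta} \rho_{R_\zeta}(h)\, dR_\zeta\, dW_{\mathbb R} \\
&= \frac{1}{2\,\mathrm{NJ}\Phi_{\mathbb R}} \int_{(N,\zeta)\in W_{\mathbb R}} \rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\,\mathrm{NJ}p_1(N,\zeta)\, dW_{\mathbb R}.
\end{aligned}
\]
Applying the coarea formula to the projection $p_1\colon W_{\mathbb R} \to \mathscr{M}_{\mathbb R}$, we can simplify the above to
\[
\begin{aligned}
\frac{1}{\mathrm{NJ}\Phi_{\mathbb R}} \int_{N\in\mathscr{M}_{\mathbb R}} \rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\,\frac12 \int_{\zeta\in p_1^{-1}(N)} dp_1^{-1}(N)\, d\mathscr{M}_{\mathbb R}
&= \frac{1}{\mathrm{NJ}\Phi_{\mathbb R}} \int_{N\in\mathscr{M}_{\mathbb R}} \rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\, d\mathscr{M}_{\mathbb R} \\
&= \frac{\mathcal{D}^{\frac{n+1}{2}}}{\mathrm{NJ}\Phi_{\mathbb R}} \int_{M\in\mathscr{M}_{\mathbb R}} \rho_{C_\zeta}(0)\,\rho_{W_\zeta}(M)\, d\mathscr{M}_{\mathbb R},
\end{aligned}
\]
where the last equality is due to the change of variables $\mathscr{M}_{\mathbb R} \to \mathscr{M}_{\mathbb R}$, $N \mapsto M$, that has the Jacobian determinant $\mathcal{D}^{-\frac{n+1}{2}}$. Now we note that
\[ \rho_{C_\zeta}(0)\cdot\rho_{W_\zeta}(M) = (2\pi)^{-n/2}\,(2\pi)^{-n^2/2}\exp\big(-\tfrac12\|M\|_F^2\big) \]
is the density of the standard Gaussian distribution on $\mathscr{M}_{\mathbb R} \simeq \mathbb{R}^{n\times(n+1)}$, so that the last integral (over $M \in \mathscr{M}_{\mathbb R}$) equals one. Altogether, we obtain, using $\mathrm{NJ}\Phi_{\mathbb R} = \mathcal{D}^{n/2}$,
\[ \int_{\mathcal{H}_{\mathbf d,\mathbb R}} \chi\,\rho_{\mathcal{H}_{\mathbf d,\mathbb R}}\, d\mathcal{H}_{\mathbf d,\mathbb R} = \mathcal{D}^{\frac{n+1}{2}}\,\mathcal{D}^{-\frac{n}{2}} = \sqrt{\mathcal{D}}. \qquad □ \]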
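The statement of Corollary 8.11 can be probed numerically in the simplest case $n = 1$, where $\mathcal{D} = d$ and a standard Gaussian system is a single binary form $\sum_k a_k X^{d-k}Y^k$ with independent $a_k \sim N(0, \binom{d}{k})$. The following Python sketch is only an illustration under these assumptions: the function name, the grid-based zero counting (which can miss pairs of very close zeros) and the sample sizes are our own choices and are not part of the argument above.

```python
import math
import random

def expected_real_zeros(d, trials=1000, grid=200, seed=0):
    """Monte Carlo estimate of the mean number of zeros in P^1(R) of a random
    binary form sum_k a_k X^(d-k) Y^k with independent a_k ~ N(0, binom(d, k))."""
    rng = random.Random(seed)
    # Standard deviations of the coefficients in the Bombieri-Weyl model.
    sigma = [math.sqrt(math.comb(d, k)) for k in range(d + 1)]
    # The half circle theta in [0, pi] covers the real projective line once.
    thetas = [math.pi * t / grid for t in range(grid + 1)]
    basis = [[math.cos(th) ** (d - k) * math.sin(th) ** k for k in range(d + 1)]
             for th in thetas]
    total = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, s) for s in sigma]
        values = [sum(c * b for c, b in zip(a, row)) for row in basis]
        # A sign change between neighbouring grid points signals a zero.
        total += sum(1 for u, v in zip(values, values[1:]) if u * v < 0)
    return total / trials
```

For instance, `expected_real_zeros(4)` should return a value close to $\sqrt{4} = 2$, up to sampling error.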

9. Effective Sampling in the Solution Variety 

We turn now to the question of effective sampling in the solution variety endowed with the measure $\rho_{\mathrm{st}}$ introduced in §3.2. The goal is to provide the proof of Proposition 3.3.

Proposition 9.1. In the setting of §8.5 suppose $\bar q = 0$, $\sigma = 1$. Then the pushforward density $\rho_{\mathscr M}$ of $\rho_W$ with respect to $p_1\colon W \to \mathscr{M}$ equals the standard Gaussian distribution in $\mathscr{M}$. The conditional distributions on the fibers of $p_1$ are uniform distributions on unit circles. Finally, the conditional distribution on the fibers of $\Psi\colon V \to W$ is induced from the standard Gaussian in $R_\zeta$ via the isometry (8.2).

Proof. Since $\rho_{\mathcal{H}_{\mathbf d}}$ is standard Gaussian, the induced distributions on $C_\zeta$, $L_\zeta$, and $R_\zeta$ are standard Gaussian as well. Hence $\rho_{W_\zeta}$ equals the standard Gaussian distribution on the fiber $W_\zeta$. Moreover, $\rho_{C_\zeta}(0) = (\sqrt{2\pi})^{-2n}$. Equation (8.7) implies that
\[ \rho_{\mathscr M}(M) = \rho_{C_\zeta}(0)\cdot\rho_{W_\zeta}(M) = (2\pi)^{-n}\,(2\pi)^{-n^2}\exp\big(-\tfrac12\|M\|_F^2\big), \]
which equals the density of the standard Gaussian distribution on $\mathscr{M}$. Lemma 8.10 combined with (8.25) gives
\[ \frac{\rho_W}{\mathrm{NJ}p_1}(M,\zeta) = \frac{1}{2\pi}\,\rho_{C_\zeta}(0)\cdot\rho_{W_\zeta}(M) = \frac{1}{2\pi}\,\rho_{\mathscr M}(M). \]
Hence the conditional distributions on the fibers of $p_1$ are uniform. (Note that this is not true in the case of nonstandard Gaussians.) The assertion on the conditional distributions on the fibers of $\Psi$ follows from Proposition 8.9. □

Proof of Proposition 3.3. Proposition 9.1 (combined with Lemma 8.8) shows that the following procedure generates the distribution $\rho_{\mathrm{st}}$.

(1) Choose $M \in \mathscr{M}$ from the standard Gaussian distribution (almost surely $M$ has rank $n$),

(2) compute the unique $[\zeta] \in \mathbb{P}^n$ such that $M\zeta = 0$,

(3) choose a representative $\zeta$ of $[\zeta]$ uniformly at random in the unit sphere,

(4) compute $g_{M,\zeta}$, cf. (8.2),

(5) choose $h \in R_\zeta$ from the standard Gaussian distribution,

(6) compute $q = g_{M,\zeta} + h$ and return $(q, [\zeta])$.
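Steps (1) to (3) of the procedure above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the implementation analyzed in the text: the function name is hypothetical, the kernel of $M$ is found by plain Gaussian elimination, and "standard Gaussian" is read as independent $N(0,1)$ real and imaginary parts for every entry. Steps (4) to (6) are omitted because they require a concrete representation of the polynomial systems.

```python
import cmath
import math
import random

def sample_matrix_and_zero(n, rng=random):
    """Steps (1)-(3): draw a Gaussian M in C^{n x (n+1)} and a unit vector zeta with M zeta = 0."""
    # Step (1): real and imaginary parts of every entry are independent N(0, 1).
    M = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n + 1)]
         for _ in range(n)]
    # Step (2): reduced row echelon form with partial pivoting on the first n columns
    # (almost surely independent), so that row i reads x_i + A[i][n] * x_n = 0.
    A = [row[:] for row in M]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                factor = A[r][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    zeta = [-A[i][n] for i in range(n)] + [1.0 + 0j]
    norm = math.sqrt(sum(abs(z) ** 2 for z in zeta))
    # Step (3): a uniformly random phase picks a point on the unit circle of representatives.
    phase = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    zeta = [phase * z / norm for z in zeta]
    return M, zeta
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible.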

An elegant way of choosing $h$ in step (5) is to draw $f \in \mathcal{H}_{\mathbf d}$ from $N(0, I)$ and then to compute the image $h$ of $f$ under the orthogonal projection $\mathcal{H}_{\mathbf d} \to R_\zeta$. Since the orthogonal projection of a standard Gaussian is a standard Gaussian, this amounts to drawing $h$ from a standard Gaussian in $R_\zeta$. For computing the projection $h$ we note that the orthogonal decomposition $f = k + g_{M,\zeta} + h$ with $k \in C_\zeta$, $M = [m_{ij}] \in \mathscr{M}$ and $h \in R_\zeta$ is obtained as
\[
\begin{aligned}
k_i &= f_i(\zeta)\,\langle X,\zeta\rangle^{d_i}, \\
m_{ij} &= d_i^{-1/2}\big(\partial_{X_j} f_i(\zeta) - d_i\, f_i(\zeta)\,\bar\zeta_j\big), \\
h &= f - k - g_{M,\zeta}.
\end{aligned}
\]
(Recall $Dg_{M,\zeta}(\zeta) = \Delta M$ and note $\partial_{X_j}\langle X,\zeta\rangle^{d_i}(\zeta) = d_i\,\bar\zeta_j$.)

It is easy to check that $O(N)$ samples from the standard Gaussian distribution on $\mathbb{R}$ are sufficient for implementing this procedure. As for the operation count: step (4) turns out to be the most expensive one and can be done, e.g., as follows. Suppose that all the coefficients of $\langle X,\zeta\rangle^{k-1}$ have already been computed. Then each coefficient of $\langle X,\zeta\rangle^{k} = (X_0\bar\zeta_0 + \cdots + X_n\bar\zeta_n)\,\langle X,\zeta\rangle^{k-1}$ can be obtained by $O(n)$ arithmetic operations, hence all the coefficients of $\langle X,\zeta\rangle^{k}$ are obtained with $O\big(n\binom{n+k}{n}\big)$ operations. It follows that $\langle X,\zeta\rangle^{d_i}$ can be computed with $O(d_i\, n\, N_i)$ operations, hence $O(DnN)$ operations suffice for the computation of $g_{M,\zeta}$. It is clear that this is also an upper bound on the cost of computing $(q,\zeta)$. □

We provide now the proof of the remaining results stated in Section 3. The next two cases we wish to analyze (the condition-based analysis of LV and a solution for Smale's 17th problem with moderate degrees) share the feature that one endpoint of the homotopy segment is fixed, not randomized. This sharing actually allows one to derive both corresponding results (Theorems 3.7 and 3.8, respectively) as a consequence of the following statement.
Theorem 10.1. For $g \in S(\mathcal{H}_{\mathbf d}) \setminus \Sigma$ we have
\[ \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} \Big( d_S(f,g) \int_0^1 \mu_2^2(q_\tau)\, d\tau \Big) \le 724\, D^{3/2} N (n+1)\, \mu_{\max}^2(g) + 0.01. \]

The idea to prove Theorem 10.1 is simple. For small values of $\tau$ the system $q_\tau$ is close to $g$ and therefore the value of $\mu_2^2(q_\tau)$ can be bounded by a small multiple of $\mu_{\max}^2(g)$. For the remaining values of $\tau$, the corresponding $t = t(\tau)$ is bounded away from $0$ and therefore so is the variance $\sigma_t^2$ in the distribution $N(\bar q_t, \sigma_t^2 I)$ for $q_t$. This allows one to control the denominator in the right-hand side of Theorem 3.6 when using this result. Here are the precise details.

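In the following fix $g \in S(\mathcal{H}_{\mathbf d}) \setminus \Sigma$. First note that we may again replace the uniform distribution of $f$ on $S(\mathcal{H}_{\mathbf d})$ by the truncated Gaussian $N_T(0, I)$. As before we choose $T := \sqrt{2N}$. We therefore need to bound the quantity
\[ Q_g := \mathop{\mathbb{E}}_{f\sim N_T(0,I)} \Big( d_S(f,g) \int_0^1 \mu_2^2(q_\tau)\, d\tau \Big). \]
To simplify notation, we set as before $\varepsilon = 0.13$, $C = 0.025$, $\lambda = 7.53\cdot 10^{-3}$, and define
\[ \delta_0 := \frac{\lambda}{D^{3/2}\mu_{\max}^2(g)}, \qquad t_T := \frac{1}{1 + T + 1.00001\,\frac{T}{\delta_0}}. \]

Proposition 10.2. We have
\[ Q_g \le (1+\varepsilon)^2\,\delta_0\,\mu_{\max}^2(g) + \frac{T}{P_{T,1}} \int_{t_T}^1 \mathop{\mathbb{E}}_{q_t\sim N(\bar q_t,\, t^2 I)} \Big( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \Big)\, dt, \]
where $\bar q_t = (1-t)g$.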

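Proof. Let $\zeta^{(1)},\ldots,\zeta^{(\mathcal D)}$ be the zeros of $g$ and denote by $(q_\tau, \zeta_\tau^{(j)})_{\tau\in[0,1]}$ the lifting of $E_{f,g}$ in $V$ corresponding to the initial pair $(g,\zeta^{(j)})$ and final system $f \in \mathcal{H}_{\mathbf d} \setminus \Sigma$.

Equation (4.1) for $i = 0$ in the proof of Theorem 3.1 shows the following: for all $j$ and all $\tau \le \frac{\lambda}{d_S(f,g)\, D^{3/2}\mu_{\mathrm{norm}}^2(g,\zeta^{(j)})}$, we have
\[ \mu_{\mathrm{norm}}(q_\tau,\zeta_\tau^{(j)}) \le (1+\varepsilon)\,\mu_{\mathrm{norm}}(g,\zeta^{(j)}). \]
In particular, this inequality holds for all $j$ and all $\tau \le \frac{\lambda}{d_S(f,g)\, D^{3/2}\mu_{\max}^2(g)}$ and hence, for all such $\tau$, we have
\[ \mu_2(q_\tau) \le (1+\varepsilon)\,\mu_{\max}(g). \tag{10.1} \]
Splitting the integral in $Q_g$ at $\tau_0(f) := \min\big\{1, \frac{\lambda}{d_S(f,g)\, D^{3/2}\mu_{\max}^2(g)}\big\}$ we obtain
\[ Q_g = \mathop{\mathbb{E}}_{f\sim N_T(0,I)} \Big( d_S(f,g) \int_0^{\tau_0(f)} \mu_2^2(q_\tau)\, d\tau \Big) + \mathop{\mathbb{E}}_{f\sim N_T(0,I)} \Big( d_S(f,g) \int_{\tau_0(f)}^1 \mu_2^2(q_\tau)\, d\tau \Big). \]
Using (10.1) we bound the first term in the right-hand side as follows,
\[ \mathop{\mathbb{E}}_{f\sim N_T(0,I)} \Big( d_S(f,g) \int_0^{\tau_0(f)} \mu_2^2(q_\tau)\, d\tau \Big) \le (1+\varepsilon)^2\,\delta_0\,\mu_{\max}(g)^2. \]
To bound the second term, we w.l.o.g. assume that $\tau_0(f) < 1$. We apply Proposition 5.2 to obtain, for a fixed $f$,
\[ d_S(f,g) \int_{\tau_0(f)}^1 \mu_2^2(q_\tau)\, d\tau \le \|f\| \int_{t_0(f)}^1 \frac{\mu_2^2(q_t)}{\|q_t\|^2}\, dt, \]
where $t_0(f)$ is given by
\[ t_0(f) = \frac{1}{1 + \|f\|\,(\sin\alpha\,\cot\delta_0 - \cos\alpha)}, \qquad \alpha := d_S(f,g). \]
Now note that $\|f\| \le T$ since we draw $f$ from $N_T(0,I)$. This will allow us to bound $t_0(f)$ from below by a quantity independent of $f$. For $\|f\| \le T$ we have
\[ 0 \le \sin\alpha\,\cot\delta_0 - \cos\alpha \le \frac{1}{\sin\delta_0} - \cos\alpha \le \frac{1}{\sin\delta_0} + 1 \]
and moreover, $\sin\delta_0 \ge 0.99999\,\delta_0$ since $\delta_0 \le 2^{-3/2}\lambda \le 0.00267$. We can therefore bound $t_0(f)$ as
\[ t_0(f) \ge \frac{1}{1 + T + \frac{T}{\sin\delta_0}} \ge \frac{1}{1 + T + 1.00001\,\frac{T}{\delta_0}} = t_T. \]
We can now bound the second term in $Q_g$ as follows:
\[
\begin{aligned}
\mathop{\mathbb{E}}_{f\sim N_T(0,I)} \Big( d_S(f,g) \int_{\tau_0(f)}^1 \mu_2^2(q_\tau)\, d\tau \Big)
&\le \mathop{\mathbb{E}}_{f\sim N_T(0,I)} \Big( T \int_{t_T}^1 \frac{\mu_2^2(q_t)}{\|q_t\|^2}\, dt \Big) \\
&= T \int_{t_T}^1 \mathop{\mathbb{E}}_{f\sim N_T(0,I)} \Big( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \Big)\, dt
\le \frac{T}{P_{T,1}} \int_{t_T}^1 \mathop{\mathbb{E}}_{f\sim N(0,I)} \Big( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \Big)\, dt.
\end{aligned}
\]
To conclude, note that, for fixed $t$ and when $f$ is distributed following $N(0, I)$, the variable $q_t = (1-t)g + tf$ follows the Gaussian $N(\bar q_t, t^2 I)$, where $\bar q_t = (1-t)g$. □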

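Proof of Theorem 10.1. By homogeneity we can replace the uniform distribution on $S(\mathcal{H}_{\mathbf d})$ by $N_T(0,I)$, so that we only need to estimate $Q_g$ by the right-hand side of Proposition 10.2. In order to bound the first term there we note that
\[ (1+\varepsilon)^2\,\delta_0\,\mu_{\max}^2(g) = (1+\varepsilon)^2\lambda\, D^{-3/2} \le (1+\varepsilon)^2\lambda \le 0.01. \]
For bounding the second term we apply Theorem 3.6 to deduce that
\[ \int_{t_T}^1 \mathop{\mathbb{E}}_{q_t\sim N(\bar q_t,\, t^2 I)} \Big( \frac{\mu_2^2(q_t)}{\|q_t\|^2} \Big)\, dt \le \int_{t_T}^1 \frac{e(n+1)}{2t^2}\, dt = \frac{e(n+1)}{2}\Big( \frac{1}{t_T} - 1 \Big) = \frac{e(n+1)T}{2}\Big( 1 + \frac{1.00001}{\delta_0} \Big). \]
Replacing this bound in Proposition 10.2 we obtain
\[
\begin{aligned}
Q_g &\le \frac{eT^2(n+1)}{2P_{T,1}} \Big( 1 + \frac{1.00001}{\lambda}\, D^{3/2}\mu_{\max}^2(g) \Big) + 0.01 \\
&\le 2eN(n+1)\, D^{3/2}\mu_{\max}^2(g) \Big( \frac{1}{2^{3/2}} + \frac{1.00001}{\lambda} \Big) + 0.01 \\
&\le 724\, N(n+1)\, D^{3/2}\mu_{\max}^2(g) + 0.01,
\end{aligned}
\]
where we used $D \ge 2$ for the last inequality. □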
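The numerical inequalities used in this section (and the one used for Proposition 11.2 below) can be checked mechanically. The following Python sketch evaluates them with the constants $\varepsilon = 0.13$, $C = 0.025$, $\lambda = 7.53\cdot 10^{-3}$ from the text; the function and dictionary key names are ours and purely illustrative. Since $\sin x / x$ is decreasing on $(0, \pi)$, it suffices to test the sine bound at the largest admissible value $0.00267$.

```python
import math

EPS = 0.13      # epsilon
C = 0.025       # constant C
LAM = 7.53e-3   # lambda

def check_constants():
    """Return the numerical inequalities used in the text as named booleans."""
    delta_max = 2 ** -1.5 * LAM  # the bound delta_0 <= 2^(-3/2) * lambda from the text
    return {
        "first_term": (1 + EPS) ** 2 * LAM <= 0.01,
        "delta_bound": delta_max <= 0.00267,
        "sine_bound": math.sin(0.00267) >= 0.99999 * 0.00267,
        "final_constant": 2 * math.e * (2 ** -1.5 + 1.00001 / LAM) <= 724,
        "approx_zero": (1 + EPS) * C <= 3 - math.sqrt(7),
    }
```

All five entries evaluate to `True`; in particular the constant 724 is obtained from $2e\,(2^{-3/2} + 1.00001/\lambda) \approx 723.92$.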

10.1. Condition-based Analysis of LV (proof). 



Proof of Theorem 3.7. The result follows immediately by combining Proposition 5.1 with Theorem 10.1, with the roles of $f$ and $g$ swapped. □

10.2. The Complexity of a Deterministic Homotopy Continuation. 

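We next prove Theorem 3.8, beginning with some general considerations. The unitary group $\mathscr{U}(n+1)$ naturally acts on $\mathbb{P}^n$ as well as on $\mathcal{H}_{\mathbf d}$ via $(\nu, f) \mapsto f \circ \nu^{-1}$. The following lemma results from the unitary invariance of our setting. The proof is immediate.

Lemma 10.3. Let $g \in \mathcal{H}_{\mathbf d}$, $\zeta \in \mathbb{P}^n$ be a zero of $g$, and $\nu \in \mathscr{U}(n+1)$. Then $\mu_{\mathrm{norm}}(g,\zeta) = \mu_{\mathrm{norm}}(g\circ\nu^{-1}, \nu\zeta)$. Moreover, for $f \in \mathcal{H}_{\mathbf d}$, we have $K(f,g,\zeta) = K(f\circ\nu^{-1}, g\circ\nu^{-1}, \nu\zeta)$. □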

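Recall $U_i = \frac{1}{\sqrt{2n}}\big(X_0^{d_i} - X_i^{d_i}\big)$ and denote by $z_{(i)}$ a $d_i$th primitive root of unity. The $\mathcal{D}$ zeros of $U = (U_1,\ldots,U_n)$ are the points $z_{\mathbf j} = \big(1 : z_{(1)}^{j_1} : \ldots : z_{(n)}^{j_n}\big) \in \mathbb{P}^n$ for all the possible tuples $\mathbf j = (j_1,\ldots,j_n)$ with $j_i \in \{0,\ldots,d_i - 1\}$. Clearly, each $z_{\mathbf j}$ can be obtained from $z_1 := (1 : 1 : \ldots : 1)$ by a unitary transformation $\nu_{\mathbf j}$, which leaves $U$ invariant, that is,
\[ \nu_{\mathbf j}\, z_1 = z_{\mathbf j}, \qquad U \circ \nu_{\mathbf j}^{-1} = U. \]
Hence Lemma 10.3 implies $\mu_{\mathrm{norm}}(U, z_{\mathbf j}) = \mu_{\mathrm{norm}}(U, z_1)$ for all $\mathbf j$. In particular, $\mu_{\max}(U) = \mu_{\mathrm{norm}}(U, z_1)$.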
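As a small sanity check of this description, the zeros of $U$ can be enumerated and substituted back into the system. The Python sketch below is illustrative only; the function names are ours, and the primitive roots are taken to be $z_{(i)} = e^{2\pi i/d_i}$, which is one admissible choice.

```python
import cmath
import itertools
import math

def zeros_of_U(degrees):
    """List the points (1, z_1^{j_1}, ..., z_n^{j_n}) with z_i = exp(2 pi i / d_i)."""
    roots = [[cmath.exp(2j * math.pi * k / d) for k in range(d)] for d in degrees]
    return [(1 + 0j,) + combo for combo in itertools.product(*roots)]

def evaluate_U(degrees, x):
    """Evaluate U_i(x) = (x_0^{d_i} - x_i^{d_i}) / sqrt(2 n) for i = 1, ..., n."""
    n = len(degrees)
    return [(x[0] ** d - x[i + 1] ** d) / math.sqrt(2 * n)
            for i, d in enumerate(degrees)]
```

For degrees $(2, 3, 4)$ this produces $2\cdot 3\cdot 4 = 24$ points, and `evaluate_U` vanishes on each of them up to rounding.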

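Proposition 10.4. $K_U(f) := K(f,U,z_1)$ satisfies
\[ \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} K_U(f) = \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} \frac{1}{\mathcal{D}} \sum_{\mathbf j} K(f,U,z_{\mathbf j}). \]

Proof. Lemma 10.3 implies for all $\mathbf j$
\[ K(f,U,z_1) = K(f\circ\nu_{\mathbf j}^{-1}, U\circ\nu_{\mathbf j}^{-1}, \nu_{\mathbf j} z_1) = K(f\circ\nu_{\mathbf j}^{-1}, U, z_{\mathbf j}). \]
It follows that
\[ K_U(f) = K(f,U,z_1) = \frac{1}{\mathcal{D}} \sum_{\mathbf j} K(f\circ\nu_{\mathbf j}^{-1}, U, z_{\mathbf j}). \]
The assertion follows now since, for all measurable functions $\varphi\colon S(\mathcal{H}_{\mathbf d}) \to \mathbb{R}$ and all $\nu \in \mathscr{U}(n+1)$, we have
\[ \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} \varphi(f) = \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} \varphi(f\circ\nu), \]
due to the isotropy of the uniform measure on $S(\mathcal{H}_{\mathbf d})$. □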



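Lemma 10.5. We have
\[ \mu_{\max}^2(U) \le 2n\,\max_i \frac{1}{d_i}(n+1)^{d_i - 1} \le 2\,(n+1)^D. \]

Proof. Recall $\mu_{\max}(U) = \mu_{\mathrm{norm}}(U, z_1)$, so it suffices to bound $\mu_{\mathrm{norm}}(U, z_1)$.

Consider $M := \mathrm{diag}\big(d_i^{-1/2}\,\|z_1\|^{1-d_i}\big)\, DU(z_1) \in \mathbb{C}^{n\times(n+1)}$. By definition we have (cf. ^O])
\[ \mu_{\mathrm{norm}}(U, z_1) = \|U\|\,\|M^\dagger\| = \|M^\dagger\| = \frac{1}{\sigma_{\min}(M)}, \]
where $\sigma_{\min}(M)$ denotes the smallest singular value of $M$. It can be characterized as a constrained minimization problem as follows:
\[ \sigma_{\min}^2(M) = \min_u \|Mu\|^2 \quad\text{subject to } u \in (\ker M)^\perp,\ \|u\|^2 = 1. \]
In our situation, $\ker M = \mathbb{C}(1,\ldots,1)$ and $DU(z_1)$ is given by the following matrix, shown here for $n = 3$:
\[ DU(z_1) = \frac{1}{\sqrt{2n}} \begin{pmatrix} d_1 & -d_1 & 0 & 0 \\ d_2 & 0 & -d_2 & 0 \\ d_3 & 0 & 0 & -d_3 \end{pmatrix}. \]
Hence for $u = (u_0,\ldots,u_n) \in \mathbb{C}^{n+1}$,
\[ \|Mu\|^2 = \frac{1}{2n} \sum_{i=1}^n \frac{d_i}{(n+1)^{d_i-1}}\,|u_i - u_0|^2 \ge \frac{1}{2n}\,\min_i \frac{d_i}{(n+1)^{d_i-1}} \sum_{i=1}^n |u_i - u_0|^2. \]
A straightforward calculation shows that
\[ \sum_{i=1}^n |u_i - u_0|^2 \ge 1 \quad\text{if}\quad \sum_{i=0}^n u_i = 0,\ \ \sum_{i=0}^n |u_i|^2 = 1. \]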
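For the reader's convenience, one way to carry out this calculation is spelled out below. It is our own short derivation and uses only the two constraints, in the form $\sum_{i=1}^n u_i = -u_0$ and $\sum_{i=1}^n |u_i|^2 = 1 - |u_0|^2$.

```latex
\begin{aligned}
\sum_{i=1}^n |u_i - u_0|^2
  &= \sum_{i=1}^n |u_i|^2 \;-\; 2\,\mathrm{Re}\Big(\overline{u_0}\sum_{i=1}^n u_i\Big) \;+\; n\,|u_0|^2 \\
  &= \big(1 - |u_0|^2\big) + 2\,|u_0|^2 + n\,|u_0|^2 \\
  &= 1 + (n+1)\,|u_0|^2 \;\ge\; 1 .
\end{aligned}
```

Equality holds exactly when $u_0 = 0$.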

The assertion follows by combining these observations. □ 

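Proof of Theorem 3.8. Equation (5.1) in the proof of Proposition 5.1 implies for $g = U$ that
\[ \frac{1}{\mathcal{D}} \sum_{i=1}^{\mathcal{D}} K(f,U,z_i) \le 217\, D^{3/2}\, d_S(f,U) \int_0^1 \mu_2^2(q_\tau)\, d\tau. \]
Using Proposition 10.4 we get
\[ \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} K_U(f) \le 217\, D^{3/2} \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} \Big( d_S(f,U) \int_0^1 \mu_2^2(q_\tau)\, d\tau \Big). \]
Applying Theorem 10.1 with $g = U$ we obtain
\[ \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} K_U(f) \le 217\, D^{3/2}\big( 724\, D^{3/2} N(n+1)\,\mu_{\max}^2(U) + 0.01 \big). \]
We now plug in the bound $\mu_{\max}(U)^2 \le 2(n+1)^D$ of Lemma 10.5 to obtain
\[ \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} K_U(f) \le 314216\, D^3 N (n+1)^{D+1} + 2.17\, D^{3/2}. \]
This is bounded from above by $314217\, D^3 N (n+1)^{D+1}$, which completes the proof. □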

11. A Near Solution to Smale's 17th Problem

We finally proceed with the proof of Theorem 3.9. The algorithm we will exhibit uses different routines for $D \le n$ and $D > n$. Our exposition reflects this structure.

11.1. The case $D \le n$. Theorem 3.8 bounds the number of iterations of Algorithm MD as
\[ \mathop{\mathbb{E}}_{f\in S(\mathcal{H}_{\mathbf d})} K_U(f) = O\big(D^3 N n^{D+1}\big). \]
For comparing the order of magnitude of this upper bound to the input size $N = \sum_{i=1}^n \binom{n+d_i}{n}$ we need the following technical lemma (which will be useful for the case $D > n$ as well).

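Lemma 11.1. (1) For $D \le n$, $n \ge 4$, we have
\[ n^D \le \binom{n+D}{D}^{\ln n}. \]

(2) For $D^2 \ge n \ge 1$ we have
\[ \ln n \le 2\ln\ln\binom{n+D}{n} + 4. \]

(3) For $0 < c < 1$ there exists $K$ such that for all $n, D$
\[ D \le n^{1-c} \;\Longrightarrow\; n^D \le \binom{n+D}{n}^{K}. \]

(4) For $D \le n$ we have
\[ n^D \le N^{2\ln\ln N + O(1)}. \]

(5) For $n \le D$ we have
\[ D^n \le N^{2\ln\ln N + O(1)}. \]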
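Parts (1) and (2) of the lemma are easy to test numerically on a finite range. The Python sketch below is merely such a spot check under our own choices (function name, range $n \le 60$, and only $D \le n$); it proves nothing beyond that range. Part (1) is tested in the equivalent form $D \le \ln\binom{n+D}{n}$, obtained by taking logarithms and dividing by $\ln n$.

```python
import math

def check_lemma_parts(n_max=60):
    """Spot-check parts (1) and (2) of Lemma 11.1 for 1 <= D <= n <= n_max."""
    for n in range(1, n_max + 1):
        for D in range(1, n + 1):
            log_binom = math.log(math.comb(n + D, n))
            # Part (1): n^D <= binom(n+D, D)^(ln n), i.e. D <= ln binom(n+D, n), for n >= 4.
            if n >= 4 and D > log_binom:
                return False
            # Part (2): ln n <= 2 ln ln binom(n+D, n) + 4 whenever D^2 >= n.
            if D * D >= n and math.log(n) > 2 * math.log(log_binom) + 4:
                return False
    return True
```

On this range the function returns `True`.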
Proof. Stirling's formula states $n! = \sqrt{2\pi}\, n^{n+\frac12} e^{-n} e^{\frac{\Theta_n}{12n}}$ with $0 < \Theta_n < 1$. Let $H(x) = x\ln\frac1x + (1-x)\ln\frac{1}{1-x}$ denote the binary entropy function, defined for $0 < x < 1$. By a straightforward calculation we get from Stirling's formula the following asymptotics for the binomial coefficient: for any $0 < m < n$ we have
\[ \ln\binom{n}{m} = nH\Big(\frac mn\Big) + \frac12\ln\frac{n}{m(n-m)} - 1 + \varepsilon_{n,m}, \tag{11.1} \]
where $-0.1 < \varepsilon_{n,m} < 0.2$. This formula holds as well for the extension of binomial coefficients in which $m$ is not necessarily an integer.
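The range claimed for the remainder $\varepsilon_{n,m}$ in (11.1) can be probed numerically for integer arguments. In the Python sketch below the function names are ours; `epsilon(n, m)` simply returns the difference between $\ln\binom{n}{m}$ and the main terms of (11.1).

```python
import math

def entropy(x):
    """Binary entropy H(x) = x ln(1/x) + (1 - x) ln(1/(1 - x)) for 0 < x < 1."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def epsilon(n, m):
    """Remainder eps_{n,m} in ln binom(n, m) = n H(m/n) + (1/2) ln(n / (m (n - m))) - 1 + eps_{n,m}."""
    exact = math.log(math.comb(n, m))
    main = n * entropy(m / n) + 0.5 * math.log(n / (m * (n - m))) - 1
    return exact - main
```

For large $n$, $m$ and $n - m$ the remainder approaches $1 - \frac12\ln(2\pi) \approx 0.081$, which lies well inside the stated interval.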

(1) The first claim is equivalent to $e^D \le \binom{n+D}{n}$. The latter is easily checked for $D \in \{1,2,3\}$ and $n \ge 4$. So assume $n \ge D \ge 4$. By monotonicity it suffices to show that $e^D \le \binom{2D}{D}$ for $D \ge 4$. Equation (11.1) implies
\[ \ln\binom{2D}{D} \ge 2D\ln 2 + \frac12\ln\frac2D - 1.1 \]
and the right-hand side is easily checked to be at least $D$, for $D \ge 4$.

(2) Put $m := \sqrt n$. If $D \ge m$ then $\binom{n+D}{n} \ge \binom{n+\lceil m\rceil}{n}$, so it is enough to show that $\ln n \le 2\ln\ln\binom{n+\lceil m\rceil}{n} + 4$. Equation (11.1) implies
\[ \ln\binom{n+\lceil m\rceil}{n} \ge \ln\binom{n+m}{n} \ge (n+m)\,H\Big(\frac{m}{n+m}\Big) + \frac12\ln\frac1m - 1.1. \]
The entropy function can be bounded as
\[ H\Big(\frac{m}{n+m}\Big) \ge \frac{m}{n+m}\ln\Big(1 + \frac nm\Big) \ge \frac{m}{n+m}\ln m. \]
It follows that
\[ \ln\binom{n+\lceil m\rceil}{n} \ge \frac12\sqrt n\,\ln n - \frac14\ln n - 1.1 \ge \frac14\sqrt n\,\ln n, \]
the right-hand inequality holding for $n \ge 10$. Hence
\[ \ln\ln\binom{n+\lceil m\rceil}{n} \ge \frac12\ln n + \ln\ln n - \ln 4 \ge \frac12\ln n - 2, \]
the right-hand inequality holding for $n \ge 2$. This shows the second claim for $n \ge 10$. The cases $n \le 9$ are easily directly checked.
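(3) Writing $D = n\delta$ we obtain from Equation (11.1)
\[ \ln\binom{n+D}{n} = (n+D)\,H\Big(\frac{\delta}{1+\delta}\Big) - \frac12\ln D + O(1). \]
Estimating the entropy function yields
\[ H\Big(\frac{\delta}{1+\delta}\Big) \ge \frac{\delta}{1+\delta}\ln\Big(1 + \frac1\delta\Big) \ge \frac{\delta}{1+\delta}\ln\frac1\delta = \frac{\delta\varepsilon}{1+\delta}\ln n, \]
where $\varepsilon$ is defined by $\delta = n^{-\varepsilon}$. By assumption, $\varepsilon \ge c$. From the last two lines we get
\[ \frac{1}{D\ln n}\ln\binom{n+D}{n} \ge \frac{\varepsilon}{2} - \frac{1-\varepsilon}{2D} + O\Big(\frac{1}{\ln n}\Big). \]
In the case $\varepsilon \le \frac34$ we have $D \ge n^{1/4}$ and we bound the above by
\[ \frac{\varepsilon}{2} - \frac{1-\varepsilon}{2n^{1/4}} + O\Big(\frac{1}{\ln n}\Big), \]
which is greater than $c/4$ for sufficiently large $n$. In the case $\varepsilon \ge \frac34$ we bound as follows:
\[ \frac{1}{D\ln n}\ln\binom{n+D}{n} \ge \frac{\varepsilon}{2} - \frac{1-\varepsilon}{2} + O\Big(\frac{1}{\ln n}\Big) = \varepsilon - \frac12 + O\Big(\frac{1}{\ln n}\Big) \ge \frac15 \]
for sufficiently large $n$.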

We have shown that for $0 < c < 1$ there exists $n_c$ such that for $n \ge n_c$, $D \le n^{1-c}$, we have
\[ n^D \le \binom{n+D}{n}^{K_c}, \]
where $K_c := \max\{4/c, 5\}$. By increasing $K_c$ we can achieve that the above inequality holds for all $n, D$ with $D \le n^{1-c}$.

(4) Clearly, $N \ge \binom{n+D}{n}$. If $D \le \sqrt n$ then, by part (3), there exists $K$ such that
\[ n^D \le \binom{n+D}{n}^{K} \le N^K. \]
Otherwise $D \in [\sqrt n, n]$ and the desired inequality is an immediate consequence of parts (1) and (2).

(5) Use $\binom{n+D}{n} = \binom{n+D}{D}$ and swap the roles of $n$ and $D$ in part (4) above. □

Theorem 3.8 combined with Lemma 11.1(4) implies that
\[ \mathop{\mathbb{E}}_f K_U(f) = N^{2\ln\ln N + O(1)} \quad\text{if } D \le n. \tag{11.2} \]
Note that this bound is nearly polynomial in $N$. Moreover, if $D \le n^{1-c}$ for some fixed $0 < c < 1$, then Lemma 11.1(3) implies
\[ \mathop{\mathbb{E}}_f K_U(f) = N^{O(1)}. \tag{11.3} \]
In this case, the expected running time is polynomially bounded in the input size $N$.
11.2. The case $D > n$. The homotopy continuation algorithm MD is not efficient for large degrees, the main problem being that we do not know how to deterministically compute a starting system $g$ with small $\mu_{\max}(g)$. However, it turns out that an algorithm due to Jim Renegar [18], based on the factorization of the $u$-resultant, computes approximate zeros and is fast for large degrees.

Before giving the specification of Renegar's algorithm, we need to fix some notation. We identify $\mathbb{P}^n_0 := \{(x_0 : \cdots : x_n) \in \mathbb{P}^n \mid x_0 \ne 0\}$ with $\mathbb{C}^n$ via the bijection $(x_0 : \cdots : x_n) \mapsto (x_1/x_0,\ldots,x_n/x_0)$. By $\|x\|_{\mathrm{aff}}$ we denote the Euclidean norm of $x \in \mathbb{P}^n_0$, i.e.,
\[ \|x\|_{\mathrm{aff}} := \Big( \sum_{i=1}^n \Big|\frac{x_i}{x_0}\Big|^2 \Big)^{1/2}, \]
and we put $\|x\|_{\mathrm{aff}} = \infty$ if $x \in \mathbb{P}^n \setminus \mathbb{P}^n_0$. An elementary argument shows that $d_{\mathbb P}(x,y) \le \|x - y\|_{\mathrm{aff}}$ for $x, y \in \mathbb{C}^n$.

By a $\delta$-approximation of a zero $\zeta \in \mathbb{C}^n$ of $f \in \mathcal{H}_{\mathbf d}$ we understand an $x \in \mathbb{C}^n$ such that $\|x - \zeta\|_{\mathrm{aff}} \le \delta$. The following result relates $\delta$-approximations to the approximate zeros in the sense of Definition 2.1.

Proposition 11.2. Let $x$ be a $\delta$-approximation of a zero $\zeta$ of $f$. Recall $C = 0.025$. If $D^{3/2}\mu_{\mathrm{norm}}(f,x)\,\delta \le C$, then $x$ is an approximate zero of $f$.

Proof. We have $d_{\mathbb P}(x,\zeta) \le \|x - \zeta\|_{\mathrm{aff}} \le \delta$. Suppose that $D^{3/2}\mu_{\mathrm{norm}}(f,x)\,\delta \le C$. Then, by Proposition 4.1 with $g = f$, we have $\mu_{\mathrm{norm}}(f,\zeta) \le (1+\varepsilon)\,\mu_{\mathrm{norm}}(f,x)$ with $\varepsilon = 0.13$. Hence
\[ D^{3/2}\mu_{\mathrm{norm}}(f,\zeta)\, d_{\mathbb P}(x,\zeta) \le (1+\varepsilon)\, D^{3/2}\mu_{\mathrm{norm}}(f,x)\,\delta \le (1+\varepsilon)\,C. \]
We have $(1+\varepsilon)C \le u_0 = 3 - \sqrt7$. Now use Theorem EZl. □

Consider now $R \ge \delta > 0$. Renegar's Algorithm $\mathrm{Ren}(R,\delta)$ from [18] takes as input $f \in \mathcal{H}_{\mathbf d}$, decides whether its zero set $V(f) \subseteq \mathbb{P}^n$ is finite, and if so, computes $\delta$-approximations $x$ to at least all zeros $\zeta$ of $f$ satisfying $\|\zeta\|_{\mathrm{aff}} \le R$. (The algorithm even finds the multiplicities of those zeros; see [18] for the precise statement.)

Renegar's Algorithm can be formulated in the BSS-model over $\mathbb{R}$. Its running time on input $f$ (the number of arithmetic operations and inequality tests) is bounded by
\[ O\Big( n\mathcal{D}^4(\log\mathcal{D})\Big(\log\log\frac R\delta\Big) + n^2\mathcal{D}^4\binom{1+\sum_i d_i}{n}^4 \Big). \tag{11.4} \]
To find an approximate zero of $f$ we may use $\mathrm{Ren}(R,\delta)$ together with Proposition 11.2 and iterate with $R = 4^k$ and $\delta = 2^{-k}$ for $k = 1, 2, \ldots$ until we are successful. More precisely, we consider the following algorithm:
Algorithm ItRen
input $f \in \mathcal{H}_{\mathbf d}$
for $k = 1, 2, \ldots$ do
    run $\mathrm{Ren}(4^k, 2^{-k})$ on input $f$
    for all $\delta$-approximations $x$ found
        if $D^{3/2}\mu_{\mathrm{norm}}(f,x)\,\delta \le C$ stop and RETURN $x$
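The control flow of ItRen can be sketched as follows. This is a hedged illustration only: `ren` and `mu_norm` are hypothetical caller-supplied routines standing in for Renegar's algorithm and for the evaluation of $\mu_{\mathrm{norm}}$, neither of which is implemented here, and `max_rounds` is a safeguard of ours that the algorithm in the text does not have.

```python
def it_ren(f, ren, mu_norm, D, C=0.025, max_rounds=64):
    """Iterate a Renegar-type solver until one output passes the test of Proposition 11.2.

    ren(f, R, delta) must return a list of candidate approximations and
    mu_norm(f, x) a positive float; both are hypothetical placeholders.
    """
    for k in range(1, max_rounds + 1):
        R, delta = 4.0 ** k, 2.0 ** -k
        for x in ren(f, R, delta):
            # Stopping test: D^{3/2} * mu_norm(f, x) * delta <= C.
            if D ** 1.5 * mu_norm(f, x) * delta <= C:
                return x, k
    return None, max_rounds
```

With a stub solver that always returns one candidate and a constant condition number of 10, the loop with $D = 2$ stops in round $k = 11$, the first $k$ with $2^{3/2}\cdot 10\cdot 2^{-k} \le 0.025$.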



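Let $\Sigma_0 := \Sigma \cup \{f \in \mathcal{H}_{\mathbf d} \mid V(f) \cap \mathbb{P}^n_0 = \emptyset\}$. It is obvious that ItRen stops on inputs $f \notin \Sigma_0$. In particular, ItRen stops almost surely.

The next result bounds the probability $\mathrm{Prob}_{\mathrm{fail}}$ that the main loop of ItRen, with parameters $R$ and $\delta$, fails to output an approximate zero for a standard Gaussian input $f \in \mathcal{H}_{\mathbf d}$ (and given $R, \delta$). We postpone its proof to §11.3.

Lemma 11.3. We have $\mathrm{Prob}_{\mathrm{fail}} = O\big(n^3N^2D^6\mathcal{D}\,\delta^4 + nR^{-2}\big)$.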

Let $T(f)$ denote the running time of algorithm ItRen on input $f$.

Proposition 11.4. We have for standard Gaussian $f \in \mathcal{H}_{\mathbf d}$
\[ \mathop{\mathbb{E}}_f T(f) = (nN\mathcal{D})^{O(1)}. \]

Proof. The probability that ItRen stops in the $(k+1)$th loop is bounded above by the probability $p_k$ that $\mathrm{Ren}(4^k, 2^{-k})$ fails to produce an approximate zero. Lemma 11.3 tells us that
\[ p_k = O\big(n^3N^2D^6\mathcal{D}\,16^{-k}\big). \]
If $A_k$ denotes the running time of the $(k+1)$th loop we conclude
\[ \mathop{\mathbb{E}}_f T(f) \le \sum_{k=0}^\infty A_k\, p_k. \]
According to (11.4), $A_k$ is bounded by
\[ O\Big( n\mathcal{D}^4(\log\mathcal{D})(\log k) + n^2\mathcal{D}^4\binom{1+\sum_{i=1}^n d_i}{n}^4 + (N + n^3)\mathcal{D} \Big), \]
where the last term accounts for the cost of the tests. The assertion now follows by distributing the products $A_k p_k$ and using that the series $\sum_{k\ge1} 16^{-k}$ and $\sum_{k\ge1} 16^{-k}\log k$ have finite sums. □

Proof of Theorem 3.9. We use Algorithm MD if $D \le n$ and Algorithm ItRen if $D > n$. We have already shown (see (11.2), (11.3)) that the assertion holds if $D \le n$. For the case $D > n$ we use Proposition 11.4 together with the inequality $\mathcal{D}^{O(1)} \le D^{O(n)} \le N^{O(\log\log N)}$, which follows from Lemma 11.1(5). Moreover, in the case $D \ge n^{1+\varepsilon}$, Lemma 11.1(3) implies $\mathcal{D} \le D^n \le N^{O(1)}$. □
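11.3. Proof of Lemma 11.3. Let $\mathcal{E}$ denote the set of $f \in \mathcal{H}_{\mathbf d}$ such that there is an $x$ on the output list of $\mathrm{Ren}(R,\delta)$ on input $f$ that satisfies $C < D^{3/2}\mu_{\mathrm{norm}}(f,x)\,\delta$. Then
\[ \mathrm{Prob}_{\mathrm{fail}} \le \mathop{\mathrm{Prob}}_{f\in\mathcal{H}_{\mathbf d}} \Big\{ \min_{\zeta\in V(f)} \|\zeta\|_{\mathrm{aff}} \ge R \Big\} + \mathrm{Prob}\,\mathcal{E}. \]
Lemma 11.3 follows immediately from the following two results.

Lemma 11.5. For $R > 0$ and standard Gaussian $f \in \mathcal{H}_{\mathbf d}$ we have
\[ \mathop{\mathrm{Prob}}_{f\in\mathcal{H}_{\mathbf d}} \Big\{ \min_{\zeta\in V(f)} \|\zeta\|_{\mathrm{aff}} \ge R \Big\} \le \frac{n}{R^2}. \]

Proof. Choose $f \in \mathcal{H}_{\mathbf d}$ standard Gaussian and pick one of the $\mathcal{D}$ zeros $\zeta_f^{(1)},\ldots,\zeta_f^{(\mathcal D)}$ of $f$ uniformly at random, call it $\zeta$. Then the resulting distribution of $(f,\zeta)$ in $V_{\mathbb P}$ has the density $\rho_{\mathrm{st}}$. Lemma 8.8 implies that $\zeta$ is uniformly distributed in $\mathbb{P}^n$. Therefore,
\[ \mathop{\mathrm{Prob}}_{f\in\mathcal{H}_{\mathbf d}} \Big\{ \min_i \|\zeta_f^{(i)}\|_{\mathrm{aff}} \ge R \Big\} \le \mathop{\mathrm{Prob}}_{\zeta\in\mathbb{P}^n} \big\{ \|\zeta\|_{\mathrm{aff}} \ge R \big\}. \]
To estimate the right-hand side probability we observe that
\[ \|\zeta\|_{\mathrm{aff}} \ge R \iff d_{\mathbb P}(\zeta, \mathbb{P}^{n-1}) \le \frac\pi2 - \theta, \]
where $\theta$ is defined by $R = \tan\theta$ and $\mathbb{P}^{n-1} := \{x \in \mathbb{P}^n \mid x_0 = 0\}$. Therefore,
\[ \mathop{\mathrm{Prob}}_{\zeta\in\mathbb{P}^n} \big\{ \|\zeta\|_{\mathrm{aff}} \ge R \big\} = \frac{\mathrm{vol}\{x \in \mathbb{P}^n \mid d_{\mathbb P}(x,\mathbb{P}^{n-1}) \le \frac\pi2 - \theta\}}{\mathrm{vol}(\mathbb{P}^n)}. \]
Due to Lemma 2.1 and using $\mathrm{vol}(\mathbb{P}^n) = \pi^n/n!$, this can be bounded by
\[ \frac{\mathrm{vol}(\mathbb{P}^{n-1})\,\mathrm{vol}(\mathbb{P}^1)}{\mathrm{vol}(\mathbb{P}^n)}\,\sin^2\Big(\frac\pi2 - \theta\Big) = n\cos^2\theta = \frac{n}{1+R^2} \le \frac{n}{R^2}. \qquad □ \]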
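Lemma 11.6. We have $\mathrm{Prob}\,\mathcal{E} = O\big(n^3N^2D^6\mathcal{D}\,\delta^4\big)$.

Proof. Assume that $f \in \mathcal{E}$. Then, there exist $\zeta, x \in \mathbb{P}^n_0$ such that $f(\zeta) = 0$, $\|\zeta\|_{\mathrm{aff}} \le R$, $\|\zeta - x\|_{\mathrm{aff}} \le \delta$, Ren returns $x$, and $D^{3/2}\mu_{\mathrm{norm}}(f,x)\,\delta > C$.

We proceed by cases. Suppose first that $\delta \le \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(f,\zeta)}$. Then, by Proposition 4.1,
\[ (1+\varepsilon)^{-1}\mu_{\mathrm{norm}}(f,x) \le \mu_{\mathrm{norm}}(f,\zeta), \]
hence
\[ \mu_{\max}(f) \ge \mu_{\mathrm{norm}}(f,\zeta) \ge (1+\varepsilon)^{-1}\, C\, D^{-3/2}\delta^{-1}. \]
If, on the other hand, $\delta > \frac{C}{D^{3/2}\mu_{\mathrm{norm}}(f,\zeta)}$, then we have
\[ \mu_{\max}(f) \ge \mu_{\mathrm{norm}}(f,\zeta) \ge C\, D^{-3/2}\delta^{-1}. \]
Therefore, for any $f \in \mathcal{E}$,
\[ \mu_{\max}(f) \ge (1+\varepsilon)^{-1}\, C\, D^{-3/2}\delta^{-1} =: \lambda_0\, D^{-3/2}\delta^{-1}. \]
Theorem C of [23] states that $\mathrm{Prob}_f\{\mu_{\max}(f) \ge \rho^{-1}\} = O(n^3N^2\mathcal{D}\rho^4)$ for all $\rho > 0$. Therefore, we get
\[ \mathrm{Prob}\,\mathcal{E} \le \mathrm{Prob}\big\{ \mu_{\max}(f) \ge \lambda_0\, D^{-3/2}\delta^{-1} \big\} = O\big(n^3N^2\mathcal{D}D^6\delta^4\big) \]
as claimed. □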
Note added in proof. Since the posting of this manuscript in September 2009 at arXiv:0909.2114, a number of references have been added to the literature. The non-constructive character of the main result in [21] (the bound in (1.3)) had also been noticed by Carlos Beltran. In a recent paper ("A continuation method to solve polynomial systems, and its complexity." Available at http://sites.google.com/site/beltranc/preprints. To appear in Numerische Mathematik) Beltran proves a very general constructive version of this result. Our Theorem 3.1 can be seen as a particular case (with a correspondingly shorter proof) of the main result of Beltran's paper. We understand that yet another constructive version for the bound in (1.3) is the subject of a paper in preparation by J.-P. Dedieu, G. Malajovich, and M. Shub.

Also, Beltran and Pardo have recently rewritten their paper [6] ("Fast linear homotopy to find approximate zeros of polynomial systems." To appear in Found. Comput. Math.). This revised version, which increases the length of the manuscript by a factor of about 4, adds considerable detail to a number of issues only briefly sketched in [6]. In particular, the effective sampling from the solution variety is now given a full description (which is slightly different from the one we give in Section 9).